New Google Analytics Data Sources

The Google Analytics Query component comes with several Data Sources out of the box, each containing a set of Google Analytics fields. Each Data Source also offers the option to add other fields as a comma-separated list called Dimensions or Measures. However, the dimension values in that list may legitimately contain commas themselves, for example in the AdContent or AdKeywords columns. To bring this information into a new table, a custom schema is required.
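To illustrate why commas inside values are a problem, here is a minimal Python sketch. The AdContent values are made up for illustration and do not come from a real Google Analytics profile; the point is only that a value containing a comma cannot be recovered by splitting the combined list.

```python
# Hypothetical AdContent values; the second one legitimately contains a comma.
ad_content_values = ["Spring sale", "Buy one, get one free"]

# Pack the values into a single comma-separated string, as a combined field would.
combined = ",".join(ad_content_values)

# Splitting on commas yields three pieces rather than the original two values.
recovered = combined.split(",")

print(len(ad_content_values))  # 2
print(len(recovered))          # 3
print(recovered)
```

Because the split cannot tell a separator from a comma inside a value, each such dimension needs its own column, which is what a custom schema provides.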

 

Creating a Custom Schema

Using a Python script, a hidden function in the Google Analytics driver can be called to create a new custom schema for the Data Source property. This function requires five arguments:
  1. tableName: the name of the new Data Source to be created
  2. description: a simple description of the Data Source
  3. dimensions: a comma-separated list of the Google Analytics Dimension values to be included in the Data Selection for the Data Source
  4. metrics: a comma-separated list of the Google Analytics Metric values to be included in the Data Selection for the Data Source
  5. profile: the Google Analytics profile id. Please note that this should match the profile id used in the Connection Options in the Google Analytics Query component.
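As a rough sketch of how the five arguments might be prepared before they are passed to the driver, the helper below collects them into a dictionary and builds the comma-separated lists. The helper name build_schema_args, the example table name, the Sessions metric and the profile id are all hypothetical placeholders; the driver call itself is deliberately not shown, since it lives in the attached job.

```python
def build_schema_args(table_name, description, dimensions, metrics, profile):
    """Collect the five arguments, joining field names into comma-separated lists.

    Hypothetical helper for illustration only; it is not part of the driver.
    """
    def as_csv(names):
        # Trim whitespace and drop empty entries before joining.
        cleaned = [name.strip() for name in names if name.strip()]
        return ",".join(cleaned)

    args = {
        "tableName": table_name.strip(),
        "description": description.strip(),
        "dimensions": as_csv(dimensions),
        "metrics": as_csv(metrics),
        "profile": str(profile).strip(),
    }

    # All five arguments are required, so reject any that ended up empty.
    missing = [key for key, value in args.items() if not value]
    if missing:
        raise ValueError("Missing values for: " + ", ".join(missing))
    return args


# Placeholder values; replace them with your own fields and profile id.
args = build_schema_args(
    table_name="AdTraffic",
    description="Ad content and keywords",
    dimensions=["AdContent", " AdKeywords "],
    metrics=["Sessions"],
    profile="12345678",
)
print(args["dimensions"])
```

Keeping the field names in Python lists until the last moment avoids stray spaces or trailing commas in the strings that are eventually handed to the function.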
 

In the attached job, these arguments can be changed on lines 9 to 13 in the Python Script component.

Note: For Azure users, tomcat8 should be changed to tomcat in the path used above.


When the component runs successfully, it will print Success=True, followed by the name and location of the new schema file on the lines underneath.

The script can be run either from inside the component by clicking the Run button, or by running the Python Script component itself. The script only needs to be run once for each schema file to be created.

 

Using the Custom Schema in the Google Analytics Query component

Once the schema has been successfully created in the step above, it can be used in the Google Analytics Query component:

The schema will not automatically appear in the Data Source dropdown until a Connection Option called Location is added:

Note: For Azure users, tomcat8 should be changed to tomcat in the path used above.

The new schema can now be selected in the dropdown.


In addition to the fields specified in the Python Script, a Dimensions field and a Metrics field will also be available to select in the Data Selection. These return a comma-separated list of any additional dimensions or metrics requested.

When the Google Analytics Query component is run, a new table will be created containing the requested data.

The job used in this article is available for download below. Please be sure to choose the correct job for your platform (_RS for Redshift, _SF for Snowflake).

Please contact Matillion Support for the RTK to use the jobs.