4 Ways to Get Unsampled Google Analytics Data in Tableau/
July 20, 2016
Tableau makes it simple to connect directly to Google Analytics through both their native Google Analytics connector and Google Sheets Web Connector. In fact, if you are dealing with a non-Premium web property that typically does not experience sampling, I recommend testing and using these two methods:
Both of the above methods retrieve the data using the Google Analytics Core Reporting API Version 3. This is important to know because when using the API, data has the risk of being sampled even if you are pulling from a Premium/360 property. Google has confirmed that future versions’ sampling levels will match the interface, but for now there are some workarounds using other useful, Premium-only features.
Many of these options can be found in Dan Wilkerson’s post about unsampled reporting, but this post will be specific to working in Tableau.
As a refresher, sampling happens in Google Analytics when ad hoc data include more than 500,000 sessions to the property for Google Analytics and 25,000,000 for Google Analytics 360 (formerly Premium). Standard reports (without any manipulation) that you can view by clicking on the left-side main navigation are not sampled because the data are in pre-aggregated tables, which means they’ve already been calculated.
The initial reports are already processed, but if you add a secondary dimension or advanced segment means that some recalculation has to happen. When this recalculation is triggered, that’s when sampling can occur. If you are wondering how all of this affects the accuracy of your data, our Data Evangelist Jonathan Weber has an great explanation.
Solution #1 – Standard Reporting or Including Sampling
This solution doesn’t really count as a workaround but I wanted to include it since the first step is understanding if the reports are being sampled at all. If I were to choose to pair the dimension Page with the metrics Pageviews or Source with Sessions, there wouldn’t be sampling. The goal would be to pair dimensions and metrics that are from standard reports in Google Analytics.
This is a good start if you are creating broad dashboards and general overviews, but it isn’t the most attractive option and doesn’t allow for in-depth reporting on specific segments of your audience. The second part of this first solution is to simply include the sampling data in the Tableau reports.
Similar to the Google Analytics interface, the API allows you to retrieve the boolean of whether it’s sampled or not, the number of sessions that the report is based on and the percentage of total sessions that the report includes in the sample. Since an automatic alert will not pop-up in Tableau if the data is being sampled, I highly recommend adding this to your Tableau dashboard. It’s vital that anyone consuming your reports know how sampled they are when it comes time to make important decisions or insights. To make this easier, I included an example here.
Add Sampling Amount to Tableau Dashboards
To get this data, we can first use the Google Sheets API to pull our report. For this example, I pulled the number of users per page from January to the end of June. Since this is not from a pre-aggregated table (I can’t click any of the standard reports to see these dimensions and metrics together) and it will include more than 500,000 sessions, I can expect it to be sampled.
After the report is run, create a new tab that references the results including the sampling levels that appear near the top. The first image below is the automatically generated report and the second is the new tab that is organized with column headers.
In Tableau, after using the Web Connector as the Data Source and entering “http://tableau.github.io:80/webdataconnector/Examples/GoogleSheetsConnector.html” into the URL, you should have the option to select the tab that you just created in the Google Sheet.
The Tableau table can be created by dragging Measure Names to the Rows shelf and Measure Values to the table as a text mark. You can then drag Measure Names again to the Filter box to just see the Measure Names you want (screenshot below). Now this can be added to any dashboards that you create in the Tableau workbook.
The next solutions will be specific to Google Analytics 360 (formerly Premium) so if you currently use this version of the suite, there are more powerful options to bypass sampling in Tableau.
Solution #2 – Custom Tables
When using the Core Reporting API or connectors that use the API like Tableau, I mentioned above that using dimensions and metrics from standard pre-aggregated reports will not be sampled. If you’re familiar with the Custom Tables feature, this should give a hint to a useful workaround.
Custom Tables are created at the property-level and can be used to design a table of any combination of dimensions and metrics. This table will be pre-aggregated like the standard reports, which means that it will be faster and unsampled in the interface with an added bonus of an increased limit of rows (200,000 instead of 75,000). It also gives you the ability to use the unsampled data in these dimensions and metrics through the API.
If I set up a Custom Table that included Users by Page (from the example above), it would automatically be unsampled if I pull the data through the API either using Tableau, Google Sheets or other connectors!
Solution #3 – Report Exports
This solution doesn’t involve the API and isn’t the best if you are dealing with reports that should be pulled or updated daily, but it is great for large datasets that span a wide date range. Since Tableau offers the option to connect to a spreadsheet, you can simply export the data you wish to build a report around from the interface. Plus, the Unsampled Report feature allows for reporting on any number of sessions and up 3 million rows of data!
Solution #4 – BigQuery Connector
Having Google Analytics 360 also gives you the ability to tap into an important resource: BigQuery. You can choose to link BigQuery to your Google Analytics 360 property, which means that the raw data is available in a cloud-based data warehouse to clean, join or query at will. This is pre-processed data and separate from the interface and API, so we don’t have to worry about any sampling. Knowing how valuable this tool is, Tableau has created a connector just for BigQuery. Note: BigQuery must be linked and live before moving to the next step.
In Tableau, choosing ‘Connect to Data’ or adding a new Data Source will show a menu of options for files and servers. Google BigQuery should be listed under servers, or you can use the search box to find it.
You should then be prompted to sign in to Google. Make sure to log in with the email address that has access to the BigQuery project that you are interested in connecting to. After choosing the dataset, you’ll notice that a list of tables shows up below- these are the full, raw tables by day. Since we want a specific date range and we certainly don’t need all of the dimensions and metrics, we’ll choose “New Custom SQL.”
The good news is that the queries that you may be used to building for BigQuery can be used right here in the Tableau interface. In fact, you can test your query in the BigQuery UI and copy-and-paste into Tableau when you are ready (just make sure to remove any tabs and spacing).
If you’re new to BigQuery and interested in learning to build queries, this post is a helpful start.
Using reporting tools outside of Google Analytics doesn’t necessarily mean you have to deal with sampling. There are features for Google Analytics 360 that allow you to report and work on large datasets in or out of the interface either by simply exporting spreadsheets, creating your own pre-aggregated tables or working with the raw data in BigQuery.
If you are using the standard version of Google Analytics, Tableau is still a powerful visualization and analysis tool- the most important takeaway is to include the sampling level using a simple Google Sheets connector.