Find Related Products Purchased Together in Google Analytics
March 28, 2016
If you have an ecommerce website and you’re interested in learning which products are purchased together most often, you might want to take advantage of this rarely discussed feature of Google Analytics.
This post assumes you already have either Universal Enhanced Ecommerce or Classic Ecommerce transaction tracking configured for your website and that you are familiar with the Core Reporting API.
Related Product Dimensions and Metrics
According to Google’s documentation, “Google Analytics can automatically generate a list of related products per product for your Ecommerce enabled property based on transaction data. You can use this data to improve product bundling, merchandizing, remarketing, and email campaigns.”
To give you a few ideas, this data may help you:
- Make better product recommendations to your users
- Take advantage of opportunities to upsell
- Test the user experience impact of showing related products on various product pages
- Find opportunities to remarket to users who purchased one item but may also be interested in another
- And more!
If you are using Enhanced Ecommerce and you track impressions of a Related Products section on your website with Product Lists Impressions and Clicks, this is not the same thing.
Product Lists are configured by you and are tracked with product impression and product click tracking (see the documentation here if this is what you’re looking for). The dimensions and metrics we’re referring to in this post are calculated by Google Analytics behind-the-scenes, and they do not appear in standard or Enhanced Ecommerce reports.
So how do they do it?
The Correlation Coefficient
To understand how this data is created, you first have to understand the following formula:
Just kidding… All of this is handled by Google Analytics (thank goodness!). Because the calculation happens behind the scenes, we cannot see exactly how Google Analytics generates this data, but in general it uses your transaction data to calculate a correlation coefficient between any given product and each potentially related product.
If it’s been a while since your last stats course, here’s a quick refresher! The correlation coefficient is a statistical measure of the strength and direction of a linear relationship between two variables. Possible values range from -1 to +1, where:
- 0 represents no linear relationship
- Values closer to 1 represent a stronger positive relationship; as one variable increases in value, the other also increases
- Values closer to -1 represent a stronger negative relationship; as one variable increases in value, the other decreases
So for example, if you have two products in your query and their correlation coefficient is 0.8, there is a strong positive correlation between the two products. Conversely, if the value is 0.3, the correlation between the two products is weak.
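To make the refresher concrete, here is a minimal sketch of a Pearson correlation coefficient calculated over purchase indicators (1 if a transaction contains the product, 0 if not). This is not Google’s calculation, which is not published; the transactions and the product IDs "A", "B", and "C" are invented purely for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable never changes")
    return cov / sqrt(var_x * var_y)


# Hypothetical transactions: each set holds the product IDs bought together.
transactions = [
    {"A", "B"},
    {"A", "B"},
    {"A", "B", "C"},
    {"C"},
    {"C"},
    {"A"},
]


def purchased(product_id):
    """1 for each transaction containing the product, 0 otherwise."""
    return [1 if product_id in t else 0 for t in transactions]


score_ab = pearson(purchased("A"), purchased("B"))  # A and B usually appear together
score_ac = pearson(purchased("A"), purchased("C"))  # A and C rarely appear together

print(f"A vs B: {score_ab:.2f}")
print(f"A vs C: {score_ac:.2f}")
```

In this toy data set, A and B come out positively correlated and A and C negatively correlated, which matches the intuition from the list above.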
When you access Related Product information via the Core Reporting API, the correlation coefficient calculated for a given product and each potentially related product will help you determine how strong the relationship is between those products.
The following table shows some example data for various product IDs, sorted by strongest correlation coefficient (Correlation Score):
|Queried Product ID|Related Product ID|Correlation Score|
There are additional dimensions and metrics available as well, as shown later in this post.
Accessing the Data
In order for Google Analytics to calculate and generate this data for your property, you need to have transaction tracking configured. It does not matter if you use the standard Ecommerce or Enhanced Ecommerce, though ideally we’d recommend Enhanced Ecommerce with Universal Analytics.
Google Analytics Interface
You will need to enable Ecommerce for your View(s) in the Ecommerce Settings tab, and you will need to enable the Related Products feature in the same place. To do so, follow these instructions:
- Go into your preferred View’s Ecommerce Settings
- You should already have Ecommerce enabled and configured for your website
- Enable Related Products
You must have at least 30 days’ worth of data in order to generate the Related Products calculations.
Core Reporting API
After you have completed the steps to enable the Related Products feature (and you have collected at least 30 days’ worth of data), you will be able to request the data via the Core Reporting API. This is the only way to get this information – it is not available in standard reports or custom reports within the interface.
You may have a preferred method of doing this, and it will vary based on how you intend to use the information. If you’d like to report on the recommended products, you might use a tool like Google Sheets, ShufflePoint, Tableau, or R. If you hope to output the results on your website, you’ll query the API from whatever language your website is built in. Either way, you will always need to query the Core Reporting API.
The available Related Product dimensions and metrics are as follows (according to their API name):
*Required for all queries, in addition to at least one Related Product metric. Do not include other product dimensions.
But you don’t have to stop there – you can also specify segments and filters just as you would for creating any other report via the Core Reporting API. For example, you might find it useful to look at your ecommerce data as a whole, and then compare products purchased together most often by particular segments. To give you a few ideas, you might ask yourself:
- Is there a stronger correlation between particular products or product groups purchased by account holders or members compared to non-members?
- Are call center employees adequately up-selling other relevant products?
- Are users from your social media campaigns more likely to purchase related products or certain types of related products than users from paid search?
- Does device category have any impact on the strength of correlation between products and related products in purchases (are users any less likely to purchase related products when on a mobile device)?
To give you a very basic example that you can build on, the following query is for the metrics Correlation Score, Queried Product Quantity and Related Product Quantity for the dimensions Queried Product ID and Related Product ID:
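As a rough sketch of what such a request can look like, the snippet below assembles a Core Reporting API (v3) query URL in Python. The `ga:` names are the Related Products API names as we understand them, including `ga:correlationModelId` as the required dimension; please confirm them in the Dimensions & Metrics Explorer before relying on them. The View ID `12345678`, the date range, and the function name are hypothetical placeholders, and the request still needs to be sent with a valid OAuth authorization, which is not shown here.

```python
from urllib.parse import urlencode

# Core Reporting API v3 data endpoint.
BASE_URL = "https://www.googleapis.com/analytics/v3/data/ga"


def build_related_products_query(view_id, start_date, end_date, max_results=50):
    """Return a query URL for related-product pairs, strongest correlation first."""
    params = {
        "ids": f"ga:{view_id}",
        "start-date": start_date,
        "end-date": end_date,
        # Assumed API names; ga:correlationModelId is treated as the required dimension.
        "dimensions": ",".join([
            "ga:correlationModelId",
            "ga:queryProductId",
            "ga:relatedProductId",
        ]),
        "metrics": ",".join([
            "ga:correlationScore",
            "ga:queryProductQuantity",
            "ga:relatedProductQuantity",
        ]),
        "sort": "-ga:correlationScore",  # leading "-" sorts descending
        "max-results": str(max_results),
    }
    return f"{BASE_URL}?{urlencode(params)}"


# Hypothetical View ID and date range, for illustration only.
query_url = build_related_products_query("12345678", "30daysAgo", "yesterday")
print(query_url)
```

The same parameter values can be pasted into the Query Explorer fields one by one, which is a convenient way to check them before writing any further code.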
Remember, you can test all of your queries using the Google Analytics Query Explorer. This is a great way to test that everything is correct before beginning to pull it into another application!
Below is a screenshot of the results:
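Once you have a response in hand, a few lines of code are enough to turn it into a list of product pairs. The sketch below assumes the Core Reporting API v3 JSON shape, where `columnHeaders` names each column and `rows` holds string values. The sample response, its SKUs, and its scores are made up for illustration and are not real results, and the `ga:` names carry the same caveat as any query you build by hand.

```python
def rows_as_dicts(response):
    """Pair each row's values with the column names from the response."""
    names = [header["name"] for header in response["columnHeaders"]]
    return [dict(zip(names, row)) for row in response.get("rows", [])]


def strongest_pairs(response, min_score=0.5):
    """Return (queried ID, related ID, score) tuples at or above min_score."""
    pairs = []
    for record in rows_as_dicts(response):
        score = float(record["ga:correlationScore"])  # values arrive as strings
        if score >= min_score:
            pairs.append(
                (record["ga:queryProductId"], record["ga:relatedProductId"], score)
            )
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)


# Hypothetical response, trimmed to the fields used above.
sample_response = {
    "columnHeaders": [
        {"name": "ga:queryProductId"},
        {"name": "ga:relatedProductId"},
        {"name": "ga:correlationScore"},
    ],
    "rows": [
        ["SKU-100", "SKU-200", "0.35"],
        ["SKU-100", "SKU-300", "0.82"],
        ["SKU-200", "SKU-300", "0.61"],
    ],
}

for queried, related, score in strongest_pairs(sample_response):
    print(f"{queried} -> {related}: {score:.2f}")
```

The `min_score` cutoff of 0.5 is an arbitrary starting point; you would tune it to however strong a relationship you want before recommending one product alongside another.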
Taking it Further
There are quite a few ways to find this information on your own. Your ecommerce platform may even offer this type of information already. The advantage of this feature is that, assuming you have enough data available, you can query these metrics and dimensions with the Core Reporting API (and evaluate them alongside other data available in Google Analytics) instead of pulling your ecommerce data and doing the calculations yourself in another platform, such as R or Python.
That said, if you have a background in R, for example, and you want to see how statistically significant your results are (or you simply want to see how your results compare to Google’s behind-the-scenes calculations), I encourage you to check out a couple of these resources:
- Association rules in R (remember Becky’s helpful guide to pulling Google Analytics data into R!)
- RGA Package for accessing data in R from the various Google Analytics APIs
Additionally, we have a great case study on analyzing Related Product data – I encourage you to check it out to see how informative this kind of data can be.