Attribution and Google Analytics
December 23, 2014
Log in to Google Analytics and have a look at the Acquisition reports, and you’ll find all kinds of data on how people get to your site. Ever wonder where that comes from, and how GA decides what the source, medium, or campaign values are? Wonder no more, because here we’ll de-mystify the rules.
The Source/Medium Rules
The basic dimensions that GA uses to describe where someone comes from are Medium and Source (along with Campaign, Keyword, and Ad Content where circumstances warrant). GA fills these in based on different sources of information, and there’s a specific order in which Google Analytics looks for this information:
- AdWords auto-tagging (Medium = cpc, Source = google)
- Manual campaign tagging (Medium & Source whatever you specify)
- Organic search engines (Medium = organic, Source = search engine)
- Referrals (Medium = referral, Source = referring site domain)
- Direct (Medium = (none), Source = (direct))
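The order above can be sketched as a small function. This is only an illustration of the precedence, not GA's actual implementation: the `classify` function, the three-entry search engine list, and the `(not set)` fallback for a missing utm_medium are all assumptions made for the example.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical, heavily abbreviated stand-in for GA's list of known search engines.
KNOWN_SEARCH_ENGINES = ("google", "bing", "yahoo")


def classify(landing_url, referrer=""):
    """Return (medium, source), checking each clue in the order listed above."""
    params = parse_qs(urlparse(landing_url).query)

    # 1. AdWords auto-tagging: a gclid parameter wins over everything else.
    if "gclid" in params:
        return ("cpc", "google")

    # 2. Manual campaign tagging: use whatever values were specified.
    if "utm_source" in params:
        medium = params.get("utm_medium", ["(not set)"])[0]
        return (medium, params["utm_source"][0])

    host = (urlparse(referrer).hostname or "").lower() if referrer else ""
    if host:
        # 3. Organic search: the referrer host matches a known search engine.
        for engine in KNOWN_SEARCH_ENGINES:
            if engine in host.split("."):
                return ("organic", engine)
        # 4. Referral: any other referring site.
        return ("referral", host)

    # 5. Direct: no tags and no referrer.
    return ("(none)", "(direct)")


print(classify("http://www.example.com/?gclid=abc", "https://www.google.com/"))
# ('cpc', 'google')
print(classify("http://www.example.com/", "http://blog.example.org/post"))
# ('referral', 'blog.example.org')
```

Note that the first call has a Google referrer as well as a gclid, and the gclid still decides the result, because it is checked first.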
Let’s explore what falls into each category, and the pieces of information GA uses to decide.
1. AdWords auto-tagging
If you have linked your Google Analytics and AdWords accounts, this automatically enables a setting in AdWords called “auto-tagging”. Auto-tagging automatically adds a tracking parameter to the end of all your ads’ destination URLs. The parameter is called the “gclid” (Google Click ID) and enables AdWords and Analytics to establish the connection between the ad click and the visit to your website.
Your auto-tagged URL is simply your normal destination URL with a gclid query parameter (whose value is a long, opaque click ID) added to the end.
(You won’t see this in your ads, as it is only added to ads when they are served up on search results pages.)
When Google Analytics sees landing page URLs with the gclid parameter, it sets the Medium to cpc, the Source to google, and the Campaign, Keyword, and Ad Content to the corresponding information from AdWords (campaign name, bid keyword, and ad headline, respectively).
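To make the shape of an auto-tagged URL concrete, here is a rough sketch that appends a gclid parameter to a destination URL and then checks for it the way an analytics tool might. The domain, the `add_gclid` and `has_gclid` helpers, and the click ID value are all made up for illustration; real gclid values are generated by AdWords.

```python
from urllib.parse import urlparse, parse_qsl, urlencode, urlunparse


def add_gclid(destination_url, click_id):
    """Append a gclid query parameter, keeping any parameters already present."""
    parts = urlparse(destination_url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("gclid", click_id))
    return urlunparse(parts._replace(query=urlencode(query)))


def has_gclid(landing_url):
    """True if the landing page URL carries a non-empty gclid parameter."""
    return any(key == "gclid" for key, _ in parse_qsl(urlparse(landing_url).query))


tagged = add_gclid("http://www.example.com/landing?color=red", "EXAMPLE-click-id")
print(tagged)
# http://www.example.com/landing?color=red&gclid=EXAMPLE-click-id
print(has_gclid(tagged))
# True
```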
Linking your accounts together is an easy process, but to do it, you’ll need a login (email address) that has both administrative access to your AdWords account and Edit permission to the GA property to which you’re linking. Here’s a recent article about how to link your AdWords account to Google Analytics.
By default, AdWords auto-tagging overrides any manual campaign tagging (see #2 below). However, a property setting in Google Analytics lets you do it the other way around if you want to for any reason (typically it’s a bad idea, because it prevents data in your GA reports from lining up with impression & click data from AdWords).
2. Manual campaign tagging
For marketing and advertising other than AdWords, you can manually add campaign tracking parameters to your landing page URLs. You can use campaign tracking for channels such as display ads, email marketing, social posts, etc.
With manual campaign tagging, instead of a “gclid” parameter like with AdWords, you add a parameter for each of Medium, Source, and Campaign (utm_medium, utm_source, and utm_campaign), plus Keyword and Ad Content (utm_term and utm_content) if needed.
With manual campaign tags, GA simply uses the values you specify. For example, for a landing page URL tagged with utm_medium=email, utm_source=newsletter, and utm_campaign=2014-12-08-Holiday-Sale, the Medium is email, the Source is newsletter, and the Campaign is 2014-12-08-Holiday-Sale.
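If you tag a lot of links, it can help to build the URLs in code rather than by hand. The following is a minimal sketch, assuming the standard utm_* parameter names; the `tag_url` helper and the example.com landing page are hypothetical.

```python
from urllib.parse import urlparse, parse_qsl, urlencode, urlunparse


def tag_url(url, medium, source, campaign, term=None, content=None):
    """Add utm_* campaign parameters to a landing page URL."""
    tags = {
        "utm_source": source,
        "utm_medium": medium,
        "utm_campaign": campaign,
        "utm_term": term,        # Keyword (optional)
        "utm_content": content,  # Ad Content (optional)
    }
    parts = urlparse(url)
    # Keep unrelated parameters, but drop any stale utm_* values.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k not in tags]
    query += [(k, v) for k, v in tags.items() if v]
    return urlunparse(parts._replace(query=urlencode(query)))


print(tag_url("http://www.example.com/sale",
              medium="email", source="newsletter",
              campaign="2014-12-08-Holiday-Sale"))
# http://www.example.com/sale?utm_source=newsletter&utm_medium=email&utm_campaign=2014-12-08-Holiday-Sale
```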
So for this to work, you of course have to actually include these tags in your destination URLs for marketing and advertising. Here’s a really comprehensive introduction to campaign tagging from Annielytics.
3. Organic search engines
OK, so #1 and #2 used clues in the URL of the page itself to say what someone’s Medium and Source should be. For the remaining categories, however, GA will use a different source of information: the URL of the page a user came from (a value called the referrer).
The referrer value is part of the HTTP protocol on which the web is built. When someone clicks a link to go to another page, their browser tells the destination site (most of the time) what page they came from. If you use your browser’s developer tools to view the HTTP request headers for your web page, you’ll actually see the referrer value in there, in a header spelled “Referer” (assuming you came from a link on another page).
So for our third category, organic search engines, GA looks at that referrer URL to determine if it is from a known list of organic search engines. If it’s on this list, GA sets the Medium to organic, the Source to the name of the search engine, and the Keyword to the search query (if provided).
The default list probably covers most of the bases as far as search engines you need, but in some circumstances you might want to augment this list. (For example, there may be some country-, language-, or industry-specific search engines not in the list that are relevant for your site). Fortunately it’s easy to add organic search sources in your property settings.
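As a rough sketch of how a referrer might be matched against such a list, assume each entry has a name, a fragment the referring domain must contain, and the query parameter that holds the search term. The three default entries, the `match_organic` helper, and the jobs.example.net engine are illustrative assumptions, not GA's real list.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical, abbreviated list: (name, domain fragment, query parameter).
DEFAULT_ENGINES = [
    ("google", "google.", "q"),
    ("bing", "bing.com", "q"),
    ("yahoo", "search.yahoo.", "p"),
]


def match_organic(referrer, engines=DEFAULT_ENGINES):
    """Return (source, keyword) if the referrer is a listed engine, else None."""
    parts = urlparse(referrer)
    host = (parts.hostname or "").lower()
    query = parse_qs(parts.query)
    for name, domain_fragment, param in engines:
        if domain_fragment in host:
            # Search engines often withhold the query; fall back to a label.
            keyword = query.get(param, ["(not provided)"])[0]
            return (name, keyword)
    return None


print(match_organic("https://www.bing.com/search?q=red+shoes"))
# ('bing', 'red shoes')

# Augmenting the list with an industry-specific engine (made up here).
custom = DEFAULT_ENGINES + [("jobsearch", "jobs.example.net", "query")]
print(match_organic("http://jobs.example.net/find?query=analyst", custom))
# ('jobsearch', 'analyst')
```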
4. Referrals
If the referrer URL isn’t on the list of known search engines, GA classifies it as a referral from a website. In this case, the Medium is referral and the Source is the domain of the referring site.
Notice that, in the Medium and Source dimensions, GA doesn’t differentiate different types of referrers, such as social media sites. Channel groupings (see below) can help us further categorize and customize the way we see our acquisition data in GA, including separating social from other referral traffic.
5. Direct
Finally, if the referrer value is not present, GA classifies the traffic as direct. In this case, the Medium is (none) and the Source is (direct), which I always sort of think is backwards, but they didn’t ask me when they assigned the labels.
We typically interpret direct traffic as someone simply typing in the URL of our site, using a bookmark, or copying and pasting a URL — that is, they already knew the URL they wanted to get to, and simply used it directly.
However, there are also a few other scenarios that cause a missing referrer value (and so show as direct in GA). These include:
- If a user traverses from a secure (HTTPS) page to a non-secure (HTTP) page, browsers don’t pass the referrer value for security reasons. No referrer value, so the traffic is classified as direct.
- If a user opens a URL from an application outside their browser (such as an email or social media app), it’s basically as if they had simply copied and pasted that URL into their browser. No referrer value, so the traffic is classified as direct. (This is a really good reason to use campaign tagging for your social posts.)
The other important thing to know is that direct traffic never overwrites a previously known Medium and Source for the user. That is, if a user comes to the site by an organic search, then comes back a week later directly, GA remembers that they came by organic search and leaves their Medium as organic. The rationale here is basically, let’s not replace some information with no information (since direct is basically no information about where someone came from).
By default, GA remembers a user’s Medium and Source for 6 months, but you can change this with a property setting in GA if you like.
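The "direct never overwrites" rule can be sketched for a single user's visit history as follows. This is a simplified model, not GA's implementation: the `attribute_visits` function is hypothetical, 6 months is approximated as 183 days, and the window is assumed to be measured from the last non-direct visit.

```python
from datetime import datetime, timedelta

CAMPAIGN_TIMEOUT = timedelta(days=183)  # assumed approximation of 6 months
DIRECT = ("(none)", "(direct)")


def attribute_visits(visits, timeout=CAMPAIGN_TIMEOUT):
    """visits: list of (timestamp, medium, source), sorted by timestamp.

    Returns the (medium, source) reported for each visit, letting a direct
    visit inherit the last known non-direct values while they are remembered.
    """
    reported = []
    last_known = None  # (timestamp, medium, source) of the latest non-direct visit
    for when, medium, source in visits:
        is_direct = (medium, source) == DIRECT
        if is_direct and last_known and when - last_known[0] <= timeout:
            reported.append(last_known[1:])  # keep the earlier information
        else:
            reported.append((medium, source))
            if not is_direct:
                last_known = (when, medium, source)
    return reported


history = [
    (datetime(2014, 1, 1), "organic", "google"),
    (datetime(2014, 1, 8), "(none)", "(direct)"),   # a week later: still organic
    (datetime(2014, 12, 1), "(none)", "(direct)"),  # past the window: direct
]
print(attribute_visits(history))
# [('organic', 'google'), ('organic', 'google'), ('(none)', '(direct)')]
```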
Beyond Medium & Source
Medium and Source are the basic acquisition dimensions, but GA also has Channel Groupings built on top of these to allow us to further categorize and customize the labels to our site and marketing activities.
GA includes a Default Channel Grouping that includes some basic categories that mostly align with a lot of the common Medium values (Paid Search, Organic Search, Direct, Email, etc.). One major difference is that the default channel grouping breaks out social traffic from other referrals with a channel called Social.
You can go even further and create Custom Channel Groupings to apply your own rules. For example, you might want to do things like:
- add a channel called Affiliate for links from affiliate sites
- split your paid search into channels for branded and non-branded, based on rules about campaigns or keywords
- split your display campaigns into prospecting and remarketing, based on rules about campaigns or sources
- split your social traffic into paid, owned, and earned, based on rules about mediums or campaigns
Basically, you can customize the channel groups to better reflect the categories you’d like to use to think about your marketing and advertising channels. Here’s a guide to channel groupings from SEER.
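Conceptually, a channel grouping is an ordered list of rules where the first match wins. Here is a minimal sketch of that idea covering a few of the examples above. The channel names, the campaign naming convention (branded campaigns start with "brand", remarketing campaigns contain "remarketing"), and the short social domain list are all assumptions you would replace with your own.

```python
# Hypothetical fragments used to recognize social referrers.
SOCIAL_DOMAINS = ("facebook.", "twitter.", "linkedin.")


def channel(medium, source, campaign=""):
    """Return a channel label for a visit; rules are checked top to bottom."""
    name = campaign.lower()
    rules = [
        ("Affiliate", medium == "affiliate"),
        ("Paid Search (Branded)", medium == "cpc" and name.startswith("brand")),
        ("Paid Search (Non-Branded)", medium == "cpc"),
        ("Display (Remarketing)", medium == "display" and "remarketing" in name),
        ("Display (Prospecting)", medium == "display"),
        ("Organic Search", medium == "organic"),
        ("Email", medium == "email"),
        ("Social", medium == "referral"
            and any(domain in source for domain in SOCIAL_DOMAINS)),
        ("Referral", medium == "referral"),
        ("Direct", (medium, source) == ("(none)", "(direct)")),
    ]
    for label, matched in rules:
        if matched:
            return label
    return "(Other)"


print(channel("cpc", "google", "Brand - Exact"))   # Paid Search (Branded)
print(channel("referral", "m.facebook.com"))       # Social
print(channel("referral", "blog.example.org"))     # Referral
```

The order matters: the narrower rule (branded, remarketing, social) has to come before the broader one it splits off from.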
Where It All Goes Sideways
If Google Analytics is missing any of the information it uses to determine Medium, Source, and other traffic dimensions, it’s going to get it wrong. Specifically, if tracking parameters (AdWords or manual campaign tracking) are missing, or if the referrer value is missing, we’re losing information.
How does this happen? In a word: redirects.
Redirects aren’t all bad — in fact they are very good, in that they serve many important functions in ensuring consistency in URLs, ensuring that links aren’t broken when pages move, etc. But you need to be careful about how you redirect one URL to another.
First of all, you should avoid using client-side redirects, which actually load a page in your browser that then tells your browser to go somewhere else. The problem with client-side redirects is that, by inserting an extra step, we lose track of the original referrer:
External source Page A -> Page B with redirect -> Page C
When we get to Page C, Page B is our referrer — not Page A, which is what we really want. (Client-side redirects are also bad for SEO, so analytics tracking isn’t the only reason you want to avoid them.)
Instead, you want server-side redirects (301 or 302). With this type of redirect, your web server simply sends the browser on to a different URL, while the original referrer information is passed along. Great!
You also need to make sure that any campaign parameters on the original URL are included in the redirected URL. If you can’t see them, neither can Google Analytics.
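How you carry the parameters over depends on your server, but the logic is roughly the same everywhere: copy the query string of the requested URL onto the redirect target. Here is a sketch of that logic, with a hypothetical `redirect_target` helper and made-up URLs. It assumes parameters already on the target take precedence and that no parameter name repeats.

```python
from urllib.parse import urlparse, parse_qsl, urlencode, urlunparse


def redirect_target(requested_url, new_url):
    """Build the redirect destination, carrying over the requested URL's query
    parameters (gclid, utm_*, and anything else) so they are not lost."""
    old = urlparse(requested_url)
    new = urlparse(new_url)
    merged = dict(parse_qsl(new.query, keep_blank_values=True))
    for key, value in parse_qsl(old.query, keep_blank_values=True):
        merged.setdefault(key, value)  # parameters on the target win on conflict
    return urlunparse(new._replace(query=urlencode(merged)))


print(redirect_target(
    "http://example.com/old-page?utm_source=newsletter&utm_medium=email",
    "http://www.example.com/new-page"))
# http://www.example.com/new-page?utm_source=newsletter&utm_medium=email
```

The returned value is what the server would send in the Location header of its 301 or 302 response.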
How exactly you implement redirects on your site is particular to your web server and/or content management system — consult your webmaster to figure out what type of redirects you are currently using and whether they are sending along all the information Google Analytics needs.