Say Goodbye to Exclude Filters in Google Analytics!

No need to panic: Exclude Filters aren’t going away anytime soon! However, after reading this post you may not WANT to use Exclude Filters as frequently. There are still many valid reasons why you may need to set them up, but when possible, it might be time to eliminate them.

Just as a quick refresher, you can use Exclude Filters in Google Analytics to block traffic data from certain sources from showing up in a particular view. Sometimes these are used to partition data into one view or another; for example, think of creating separate views for Internal and External Traffic. For these use cases, Filters work beautifully. You can filter based on IP Address, Hostname, Service Provider, and so on.

But then there are those occasions where you want to block out traffic completely.  Just as easily, you can set up an Exclude filter for each of your views, and poof! The data has disappeared!  Except, it hasn’t really.

The raw, unfiltered data is stored by Google Analytics. Anytime you request a more specific report, either by adding a secondary dimension or applying an Advanced Segment, GA will go back to that unfiltered data to rerun your report, and then filter the results for your specific view. Why is this important?

Even filtered data will increase sampling in your reports.

If you try to run a non-standard report that includes more than 250K visits/sessions, Google Analytics will use a sampling algorithm. So all of this raw data you think you’re filtering out is actually going to continue to stick around and cause sampling issues for you. Click here for more information about how sampling works with Google Analytics.
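
To make the effect concrete, here is a rough sketch in JavaScript. It is a simplified model, not Google's actual sampling algorithm: it only assumes, as described above, that the 250K threshold is checked against the sessions collected for the property before any view filter is applied. The session counts and function names are made up for illustration.

```javascript
// Simplified model (an assumption, not Google's algorithm): sampling kicks in
// when the sessions collected for the property exceed the threshold,
// regardless of what a view filter hides afterwards.
var SAMPLING_THRESHOLD = 250000;

function isSampled(sessionsCollected) {
  return sessionsCollected > SAMPLING_THRESHOLD;
}

// Hypothetical numbers for one date range.
var publicSessions = 200000;
var internalSessions = 80000;

// Exclude Filter: internal sessions are still collected, then hidden.
var withExcludeFilter = isSampled(publicSessions + internalSessions);

// GTM blocking rule: internal sessions are never sent.
var withGtmBlocking = isSampled(publicSessions);

console.log(withExcludeFilter); // true
console.log(withGtmBlocking);   // false
```

In this made-up case, the 80,000 internal sessions are what push the report over the threshold, even though the filtered view never shows them.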

In cases where you want to filter out traffic completely, my advice is to stop sending that data to Google Analytics.

The answer here is Google Tag Manager. If you’re not familiar with Google Tag Manager, we’ve got a few articles on our blog that will help get you up to speed. Long story short, we love it here and it will change your (analytics) life.

Let’s go over how data gets from your website into Google Analytics. In a standard implementation using Google Tag Manager and Google Analytics, the following steps happen in order.

    1. Visitor Reaches Your Page
    2. Google Tag Manager Loads
    3. GTM Executes a Google Analytics Tag
    4. Pageview Is Sent to Google Analytics
    5. Google Analytics Receives the Pageview
    6. GA Applies Any Filters for Each View against the Pageview
    7. Pageview Shows Up In Your Analytics

Rather than waiting until the Pageview hits the filter to decide that it’s not what we want, I propose we recreate the Exclude Filtering rules using custom Macros and Rules in Google Tag Manager.  If we can determine that we don’t want a particular visitor to show up in our Analytics, let’s not even serve them the Google Analytics tag. We’ll use Google Tag Manager as a shield in this situation, to keep that data out of our analytics entirely.

If there’s a reason you want to see this data in the Google Analytics interface, let’s say from a development site or from internal traffic, once your rules are created you can always serve it a different Analytics tag or use a dynamic Macro to change the UA number, so that it won’t count against your main property’s reporting limits.
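
As a rough idea of what such a dynamic Macro could compute, here is a small sketch. It is written as a plain function so it can be run on its own; inside Tag Manager the same logic would sit in a Custom Javascript macro and read the hostname from another macro. The property IDs, the mysite.com hostname, and the function name are all placeholders, not values from a real account.

```javascript
// Hypothetical property IDs; replace with your own.
var MAIN_PROPERTY = 'UA-XXXXXXX-1';
var TEST_PROPERTY = 'UA-XXXXXXX-2';

// Returns the property that should receive hits from the given hostname.
// Assumption: only mysite.com and www.mysite.com count as the main site.
function pickPropertyId(hostname) {
  var isMainSite = /^(www\.)?mysite\.com$/.test(hostname);
  return isMainSite ? MAIN_PROPERTY : TEST_PROPERTY;
}

console.log(pickPropertyId('www.mysite.com')); // UA-XXXXXXX-1
console.log(pickPropertyId('dev.mysite.com')); // UA-XXXXXXX-2
```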

Note: It will still be helpful to have Exclude Filters on your account as a backup. This blog post will help keep traffic from reaching your account, but it won’t stop traffic from any sites that aren’t loading your Google Tag Manager.

So let’s get down to business. Two of the more popular Include/Exclude Filters are based on Hostname or on IP Address.

Using Google Tag Manager to Exclude by Hostname

First up, we can create a Macro/Rule in Tag Manager that we can use to block anything that isn’t our main site. For this example, let’s assume that our site has the following subdomains:

  • www.mysite.com or mysite.com
  • dev.mysite.com
  • admin.mysite.com

We only want the traffic to our main site to be sent into Google Analytics. We can do this through a simple Macro and a new Rule.

Create the “url hostname” macro

1. In Tag Manager, click the New button and choose Macro to start a new macro.

2. Name this Macro “url hostname”

3. Select URL as the type of Macro.

4. Select Host Name as the Component Type.

5. Save!

Create the Rule – “Traffic – Bad Hostname”

In Google Analytics, we have the option of using Include or Exclude. We could do the same thing here, but I’ve found that it’s much easier to create this rule as a standalone rule and use it to Block tags from firing. If you wanted to, you could add it to all the Rules that control firing, but then you’re duplicating the same logic over and over.

I’m calling this rule “Traffic – Bad Hostname” as a catchall for any hostname that isn’t www.mysite.com or mysite.com.

1. Click the New button and choose Rule to start a new rule.

2. Name this Rule “Traffic – Bad Hostname”

3. Choose the “url hostname” macro from the drop down list.

4. In the next box, choose the “does not match RegEx” option

5. In the next box, enter the following regular expression. This accounts for the fact that sometimes my site can be loaded with or without the www.

^(www\.)?mysite\.com$
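
Before saving the rule, it can help to try the pattern against a few hostnames in a browser console. The sketch below does that. Note that the dots are escaped: an unescaped dot matches any character, so a look-alike hostname could slip through. The last hostname in the list is a made-up example of exactly that.

```javascript
// The dots are escaped so that they match a literal "." only.
var mainSite = /^(www\.)?mysite\.com$/;

var hostnames = [
  'mysite.com',
  'www.mysite.com',
  'dev.mysite.com',
  'admin.mysite.com',
  'wwwxmysite.com' // hypothetical look-alike, shows why escaping matters
];

hostnames.forEach(function (hostname) {
  // "does not match RegEx" is true when test() returns false.
  var blocked = !mainSite.test(hostname);
  console.log(hostname + ' -> ' + (blocked ? 'blocked' : 'tracked'));
});
```

Only the first two hostnames come out as tracked; the other three are blocked.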

Use that Rule to Block Tags from Firing

Now that we know this traffic isn’t the general public coming to our main site, we can safely block certain tags from firing. In this case, that will be the basic Google Analytics tracking tag.

1. Open the tag you want to block with this rule.

2. Under Blocking rules, click Add

3. Select the “Traffic – Bad Hostname” rule

4. Save!
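
Conceptually, a blocking rule acts as a veto. The sketch below is not how Tag Manager is implemented; it only models the behaviour relied on here, where a tag fires when at least one firing rule is true and no blocking rule is true. The rule functions and the tag object are hypothetical.

```javascript
// A rule is modelled as a function that takes page context and returns true/false.
var allPages = function (ctx) { return true; };
var badHostname = function (ctx) {
  return !/^(www\.)?mysite\.com$/.test(ctx.hostname);
};

// Hypothetical tag: fires on all pages unless the hostname is "bad".
var analyticsTag = {
  firingRules: [allPages],
  blockingRules: [badHostname]
};

function shouldFire(tag, ctx) {
  var fires = tag.firingRules.some(function (rule) { return rule(ctx); });
  var blocked = tag.blockingRules.some(function (rule) { return rule(ctx); });
  return fires && !blocked; // a blocking rule always wins
}

console.log(shouldFire(analyticsTag, { hostname: 'www.mysite.com' })); // true
console.log(shouldFire(analyticsTag, { hostname: 'dev.mysite.com' })); // false
```

Keeping the hostname check in one blocking rule means the same rule can be attached to any other tag without repeating the logic.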

Using Google Tag Manager to Exclude by IP Address

Now that we’ve set up our Exclude by Hostname capabilities in Tag Manager, let’s look at IP Address. This one is a little harder to implement because it requires some server-side logic: by default, IP addresses aren’t available through JavaScript on a webpage. Depending on the platform your site is running on, this should still be fairly easy. You’ll need to look up the appropriate way to get the visitor’s IP address for the way your site is configured. Once you have that IP address, it’s time to write it to the dataLayer.

Google Tag Manager utilizes something called the dataLayer to pass information from the page into Tag Manager. For more information, check out this post about unlocking the dataLayer. The idea here is that you collect any sort of information about the visitor, the session, etc… and write it on the dataLayer when the page loads. When Google Tag Manager loads, it can see this information and, through Macros, use it for Rules and other Macros.

For the IP Address Exclude, I’ve created a more complicated Macro to allow you to identify multiple types of visitors by their IP addresses. This will let you use these types of visitors in Firing and Blocking rules.

Add IP Address to the dataLayer

The dataLayer needs to be loaded ABOVE the Google Tag Manager snippet, so that this data is available when Tag Manager loads. You can use something like the following code:

<script type="text/javascript">
  // The hard-coded address is a placeholder; your server should print the visitor's real IP here.
  dataLayer = [{
    'visitorIP': '155.55.155.155'
  }];
</script>
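
How the IP address gets into that snippet depends on your server. As one possible sketch, assuming a Node.js server (an assumption; this post does not depend on any particular platform), the helpers below read the address from the request and render the dataLayer block. The function names are hypothetical. The X-Forwarded-For handling is deliberately simple, and is only appropriate if that header is set by a proxy you trust.

```javascript
// Returns the visitor's IP address from a Node.js-style request object.
// Assumption: when an X-Forwarded-For header is present, it was set by a proxy you trust and
// its first entry is the original client address.
function getVisitorIP(req) {
  var forwarded = req.headers['x-forwarded-for'];
  if (forwarded) {
    return forwarded.split(',')[0].trim();
  }
  return req.socket.remoteAddress;
}

// Builds the script block that must be printed above the GTM snippet.
// JSON.stringify quotes the value; replacing "<" keeps a crafted header
// from closing the script tag early.
function renderDataLayer(ip) {
  var json = JSON.stringify({ visitorIP: ip }).replace(/</g, '\\u003c');
  return '<script type="text/javascript">\n' +
    ' dataLayer = [' + json + '];\n' +
    '</script>';
}

// Example with a fake request object.
var fakeReq = { headers: {}, socket: { remoteAddress: '155.55.155.155' } };
console.log(renderDataLayer(getVisitorIP(fakeReq)));
```

Node may report IPv4 clients in an IPv6-mapped form such as ::ffff:155.55.155.155, so check what your server actually returns before writing your patterns.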

Create the visitorIP Macro

1. In Tag Manager, click the New button and choose Macro to start a new macro.

2. Name this Macro “visitorIP”

3. Select Data Layer Variable as the type of Macro.

4. In the Data Layer Variable Name, enter whatever you called the IP Address. In my case, this would be visitorIP.

Create the visitorType Macro

1. In Tag Manager, click the New button and choose Macro to start a new macro.

2. Name this Macro “visitorType”

3. Select Custom Javascript as the type of Macro.

4. Paste in the following code.

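function() {
  var newIP = '{{visitorIP}}';

  // Regex literals are used so that "\." matches a literal dot.
  // (Inside a quoted string, "\." collapses to "." and matches any character.)

  // INTERNAL TRAFFIC: 28.28.128.1 and 28.28.128.10 through 28.28.128.19
  if (/^28\.28\.128\.1[0-9]?$/.test(newIP)) { return 'internal'; }

  // MONITORING SYSTEM: 65.65.65.120 through 65.65.65.129
  if (/^65\.65\.65\.12[0-9]$/.test(newIP)) { return 'monitoring'; }

  // Everyone else is treated as the general public
  return 'external';
}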

5. Alter the code to suit your needs.  Regular expressions will come in handy here, just like in Google Analytics Filters.

6. Save!
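
If the list of visitor types grows, one option is to keep the patterns in a table and loop over them. This is only a sketch of that idea, written as a plain function so it can be tested outside Tag Manager; inside a Custom Javascript macro you would read the address from the visitorIP macro instead of a parameter. The IP ranges are the same made-up examples as in the macro code, and the names VISITOR_TYPES and classifyVisitor are hypothetical.

```javascript
// Each entry pairs a visitor type with the pattern that identifies it.
// Regex literals are used so that "\." really means a literal dot.
var VISITOR_TYPES = [
  { type: 'internal', pattern: /^28\.28\.128\.1[0-9]?$/ },
  { type: 'monitoring', pattern: /^65\.65\.65\.12[0-9]$/ }
];

// Returns the first matching type, or 'external' when nothing matches.
function classifyVisitor(ip) {
  for (var i = 0; i < VISITOR_TYPES.length; i++) {
    if (VISITOR_TYPES[i].pattern.test(ip)) {
      return VISITOR_TYPES[i].type;
    }
  }
  return 'external'; // default: the general public
}

console.log(classifyVisitor('28.28.128.15'));   // internal
console.log(classifyVisitor('65.65.65.123'));   // monitoring
console.log(classifyVisitor('155.55.155.155')); // external
```

Adding a new visitor type then means adding one row to the table rather than another if statement.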

Create the Rule – “Traffic – Not External”

Again, I’ll create a rule as a catch-all for everything that is not one of the “external” visitors that we want to track. It might be helpful at some point to create special rules for different visitor types, but in this case, I’ll lump them all together.

1. Click the New button and choose Rule to start a new rule.

2. Name this Rule “Traffic – Not External”

3. Choose the “visitorType” macro from the drop down list.

4. In the next box, choose the “does not equal” option

5. In the next box, enter the word “external”

Use that Rule to Block Tags from Firing

If we can use IP Addresses to identify visitors as not the general public or to identify traffic that we want to block completely, we can use this rule to block any tag from firing.

1. Open the tag you want to block with this rule. In this example, we’ll block Google Analytics from even firing.

2. Under Blocking rules, click Add

3. Select the “Traffic – Not External” rule

4. Save!

General Conclusions

So there you have it: two ways to use Google Tag Manager to shield your Analytics data from unwanted traffic. To leave you with a few extra points, having this sort of capability inside of Google Tag Manager can extend way beyond just blocking the Google Analytics tag from firing. Consider the following scenarios:

  • Surveys/Pop-Ups –  If you’re using a Tag to launch 3rd party surveys or ads, chances are you’re receiving some sort of report about how well they’re performing. It would make sense in this case to block these tags from even firing for Internal Employees, which should help improve your response/clickthrough rate.
  • E-Commerce – With your new Macros and Rules, you can effectively block any test transactions from Internal Traffic from ever getting to Google Analytics.
  • Separate Properties – Sometimes you want to see the data going into Google Analytics to make sure everything is working properly. Consider using a test property with a different UA number for all testing/QA. This way your test data isn’t contributing to increased sampling and you can safely monitor your test cases. You can use these Macros to block one tag from firing, and then trigger another tag to fire instead.
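
The last scenario boils down to a small routing decision. In Tag Manager you would express it with rules (for example, blocking the main tag with “Traffic – Not External” and firing a test tag when visitorType equals “internal”), but the logic can be sketched as a plain function. The tag names here are hypothetical.

```javascript
// Hypothetical tag names; use whatever your container calls them.
var MAIN_TAG = 'GA - Main Property';
var TEST_TAG = 'GA - Test Property';

// Decide which Analytics tag, if any, should fire for a visitor type.
function tagForVisitor(visitorType) {
  if (visitorType === 'external') { return MAIN_TAG; }
  if (visitorType === 'internal') { return TEST_TAG; }
  return null; // e.g. 'monitoring': send nothing at all
}

console.log(tagForVisitor('external'));   // GA - Main Property
console.log(tagForVisitor('internal'));   // GA - Test Property
console.log(tagForVisitor('monitoring')); // null
```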

Last note of caution: if you have a high-traffic site and you’re constantly hitting sampling, this fix may not do much to help your situation. Consider Google Analytics Premium, which increases the number of visits you can include in your sampled reports and gives you the option to request unsampled custom reports. Contact us if you’re interested in more information about Google Analytics Premium.

Jon Meck

About Jon Meck

Jon Meck is a Digital Analytics Engineer working with Google Analytics and Google Tag Manager. He is an Excel enthusiast and strong proponent of the "Work Smarter, Not Harder" mantra. He has a history of working with web technologies for companies large and small. Outside of work, Jon enjoys late nights working on pet projects, running races, and spending time with his family.

http://www.lunametrics.com/blog/2014/03/11/goodbye-to-exclude-filters-google-analytics/

8 Responses to “Say Goodbye to Exclude Filters in Google Analytics!”

Xavier says:

Hi Jon, very interesting post

Yep, sampling will apply to filtered profiles as well, but by creating them (let’s say a filter by device type) we precisely avoid having to segment data, so what’s the business case of using GTM to not gather data at all when we can use filters and still keep it?

I hope the question was clear

Jon Meck says:

Hi Xavier,

There are scenarios where Exclude Filters can be useful, such as segmenting data into different views. In that regard, keep on using Exclude or Include filters!

My article is targeting cases where you want to eliminate data altogether. In those instances, I would suggest using GTM to help make your data collection smarter, rather than relying on Filters to clear out the bad data.

Hope that helps!
-Jon

frios says:

Jon,

I’ve been looking for this type of solution, thank you. We get hit with sampling when accessing more than 30 days of data.

@Xavier, this is particularly helpful when visitors use your website to access a login portal. Visitors will come to our home page just to click ‘login’ which counts as a session. When you’ve got hundreds of thousands of sessions getting created, excluding data all together is extremely helpful.

udi says:

Hi Jon,

I am confused. According to Google “session sampling occurs at the property level, not the view level. For ad-hoc queries, the sample set of 250K sessions is determined at the property level, and then the view-level filters are applied. As such, views that are filtered may have fewer sessions included in the sampled calculation.”

This seems to suggest that sessions matching exclusions criteria set at the view filter level do not increase the total session number.

Am i missing something?

Udi

Jon Meck says:

I’m not sure I follow your question exactly. Keep in mind that data is collected at the Property level. Views are just that, different ways to slice and dice the same data.

Let me give you an example though. Let’s say you have a View that filters out all mobile traffic.

When you run an ad-hoc query, like applying a secondary dimension or advanced segment, GA will go back to the property level and pick out 250,000 sessions to use as the base for your report. Those sessions may be desktop/mobile/whatever.

Then, using those sessions, it applies your view filters. So now the total number of sessions you’re using has gone down.

Then it runs its sampling calculations.

Hope that helps! Also note that it’s now possible to get up to 500,000 sessions before you get sampling.
-Jon

udi says:

oh, i understand. so by creating exclusion rules in GTM, we avoid sending those sessions to the Property altogether.

thanks,
Udi

Jon Meck says:

Exactly! We either avoid sending those sessions anywhere or we use Rules to send them to a DIFFERENT property.

Hirthanu says:

I have created a new rule and macro to exclude unwanted traffic from a certain website, but how do I fire the rule? My question is which Tag I need to select to exclude the unwanted traffic. Please reply.