
Custom Filters for GA, Part 4: Custom Advanced Filters

Custom Advanced filters are so cool, and there is so little information available about them. It’s too bad that they have such an intimidating name.

I find myself using them in three ways:

  1. To rewrite stuff in GA. Usually, this is to rewrite a URI. For example, we work with a site where the CMS insists on giving the same page three different URIs, and we are using custom advanced filters (among other things) to rewrite them, so that we always know what page we are looking at.
  2. To associate data that aren’t already associated. For example, Benjamin in SF wrote me and asked how to associate a referring source with a transaction ID. This is a job for Superman, er, Custom Advanced Filters.
  3. To change all sorts of other things (but which are mostly about #1 and #2.)

So let’s look at a really easy Custom Advanced filter — rewriting all your URIs to be Title Tags. (True, you can already see them in the Content Performance > Content by Title report, but this will rewrite them for every report.)

I want to teach two things before I start.

Thing #1: When I wrote about Regular Expressions, I explained how a dot means, match any one character. And I wrote about how a star means, match zero or more of the previous item. So when you put them together, they mean, match everything.

Thing #2: When you use parentheses, they create a variable in Regular Expressions. Most of the time, I don’t care. But it really matters in Custom Advanced filters.

Putting Thing #1 with Thing #2: When you write this: (.*) it means, get everything and put it in a variable.
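If it helps to see that outside of GA, here is a minimal sketch in Python. It assumes Python's `re` module behaves closely enough to GA's regex engine for this simple pattern, and the title string is made up for illustration.

```python
import re

# Hypothetical page title, used only for illustration.
title = "Custom Filters for GA, Part 4"

# The dot matches any one character, the star repeats it zero or more times,
# and the parentheses store whatever was matched as the first variable (group 1).
match = re.match(r"(.*)", title)
captured = match.group(1)

print(captured)
```

The variable ends up holding the whole title, which is exactly the "get everything and put it in a variable" idea.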

OK, now we are ready to start. Check out the screenshot below.

First, I gave it a friendly name (“Rewrite URI, etc.”) Then I chose Custom filters, then within custom, I chose Advanced. As soon as I chose Advanced, I got all the other options below it.

Today, we are ignoring the middle set of boxes, the ones that say “Field B”, and are just dealing with Field A and the output. So every time I talk about the A stuff, I am referring to the boxes that say, Field A -> Extract A.

Now, let’s sit back for a moment and think about what we’re doing before we do it. Our goal is to get the page title and to rewrite it (to reconstruct it) so that it shows up everywhere that Request URI might. So instead of seeing URIs (URLs; you can all fight about the correct way to say that), we’ll see page titles.

To do this, we first choose Page Title as Field A (just like we chose filter fields in this post that I wrote last weekend. You have to decide, what are you working on?) Then we extract it: we create a Regular Expression (RegEx) that describes it. In this case, our RegEx is (.*), i.e. get everything and put it in a variable (like I described earlier in this post.)

Next, we decide where we are going to put it. We want to output it to the URI.

Now, here is the magic (or at least, that’s the way it felt to me as I went through life, trying to understand what $A1 or $B3 was.) The first variable (the first set of parentheses) in the -> Extract A field is called $A1. We only have one variable in this screenshot, but if we had a second one, it would be $A2. $A3 for the third one (if we had one), and so on. So when we use $A1 as our constructor, it means, use the first variable (.*) in the Extract A field to reconstruct our URI.
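Here is a small sketch of how that numbering works when there is more than one set of parentheses. It is written in Python rather than in GA, and the pattern and the URI are hypothetical; the only point is that the variables are numbered left to right in the order the parentheses open.

```python
import re

# Hypothetical Extract A pattern with two sets of parentheses,
# and a made-up Request URI for it to match.
extract_a = r"^/([^/]+)/([^/]+)/?$"
request_uri = "/blog/custom-filters/"

match = re.search(extract_a, request_uri)

# Number the captured variables the way the filter form names them:
# the first set of parentheses is $A1, the second is $A2, and so on.
variables = {f"$A{i}": value for i, value in enumerate(match.groups(), start=1)}

print(variables)
```

The same idea applies to Field B, where the variables would be $B1, $B2, and so on.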

I know that was confusing, so let me say it another way. Here’s what we did. We took the title tag and matched it with a Regular Expression in the Extract A field. The expression we used was (.*), i.e. get everything and put it in a variable. (So that means, we put the whole title tag in a variable.) Then we told the constructor fields to take the Request URI and rewrite it to be the first A variable, which is now defined as the whole title tag. Consequently, all URIs get rewritten as their page’s title tag.
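For those who think better in code, here is a rough simulation of the whole filter in Python. This is only a sketch of the idea, not how GA actually implements it: the function name, the dictionary keys, and the sample hit are all made up, and it assumes a hit is left alone when Field A does not match.

```python
import re

def apply_advanced_filter(hit, field_a, extract_a, output_to, constructor):
    """Simulate a Custom Advanced filter that uses only Field A (hypothetical helper)."""
    match = re.search(extract_a, hit.get(field_a, ""))
    if match is None:
        # Assumption: when Field A does not match, the hit is left unchanged.
        return hit
    # Swap each $A1, $A2, ... in the constructor for the matching variable.
    new_value = re.sub(
        r"\$A(\d+)",
        lambda ref: match.group(int(ref.group(1))) or "",
        constructor,
    )
    rewritten = dict(hit)
    rewritten[output_to] = new_value
    return rewritten

# Made-up hit: one page view with a title and a URI.
hit = {"page_title": "Widgets | Example Store", "request_uri": "/p?id=123"}

# Field A = Page Title, Extract A = (.*), Output To = Request URI, Constructor = $A1
rewritten = apply_advanced_filter(hit, "page_title", r"(.*)", "request_uri", "$A1")

print(rewritten["request_uri"])
```

After the filter runs, the Request URI of the sample hit holds the page title, which is what the settings described above are meant to do for every page view.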

Please comment if you didn’t understand anything. (I’m serious. I got on someone else’s blog today and said, I just don’t understand.) Or send me email to my last name at my company name dot com.

Robbin

Robbin Steif

About Robbin Steif

Our owner and CEO, Robbin Steif, started LunaMetrics ten years ago. She is a graduate of Harvard College and the Harvard Business School, and has served on the Board of Directors for the Digital Analytics Association. Robbin is a recent winner of a BusinessWomen First award, as well as a Diamond Award for business leadership.

http://www.lunametrics.com/blog/2007/05/04/custom-filters-for-ga-part-4-custom-advanced-filters/

16 Responses to “Custom Filters for GA, Part 4: Custom Advanced Filters”

Rob van Tol says:

Well I understood it, but thanks for taking the time to s p e l l i t o u t … does seem a complex way of making something quite simple work. It feels a little like working with magic, but it works, and I’d never have worked it out from the Google Analytics interface or help. So big thanks for this and the whole series.

Nice explanation, Robbin. This was a big help.
Thanks

enormously helpful explanation, thank you!
One thing I still don’t understand is why the $ symbol is used here. In other expressions it’s an ending anchor, but it seems like it represents something totally different in this context.

Robbin says:

Yes, it is something totally different here. It means, go up to the Extract field. Specifically, because it is $A1, it means, go up to Extract Field A and grab the first variable. In this example, it is the only variable in Extract Field A.

Remember that the constructor field is not sensitive to Regular Expressions.
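A quick sketch of the two meanings side by side, in Python. The URI and the constructor text are made up, and the plain string replacement is only a stand-in for what the constructor field does; it assumes, as described above, that the constructor treats everything except the $A1 reference as ordinary text.

```python
import re

uri = "/index.html"  # hypothetical Request URI

# Inside a Regular Expression, $ is an anchor: the match must end where the text ends.
ends_with_html = re.search(r"html$", uri) is not None
ends_with_index = re.search(r"index$", uri) is not None

# In the constructor, $A1 is a reference to the first variable from Extract A.
# Anything else, even something that looks like a RegEx, is treated as plain text.
match = re.search(r"/(\w+)\.html$", uri)
constructor = "(.*) $A1"
output = constructor.replace("$A1", match.group(1))

print(ends_with_html, ends_with_index, output)
```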

[...] Custom Advanced Filters explained by Robbin Steif [...]


doug says:

Really nice explanation. The paragraph that begins “Now here is the magic” is particularly good: as soon as I read through it once I realized you were talking about a ‘backreference’ and that made everything click into place.

Robbin Steif says:

Hi Doug, glad you like. I wrote this post just over three years ago, and it still gets comments…

Joe says:

Great explanation, thank you. Can I use this filter to change the name of only certain pages by identifying them in the regular expression, or does this just specify which part of the page title will be put into the URL field?

Robbin says:

@Joe, you can use this filter to do (almost) anything. You can use it to do both the things in your comment. You can use it to concatenate someone’s city and state and put that back in the city field (or the state field). It is mostly limited by the fields in the backend of GA. For example, there is no field for “custom variable slot 1” so you can’t use it to mess with your custom variables.

Neeraj says:

Really great explanation, Robbin

Anu says:

Robbin,

Nice post. I have a quick question. My site is a subscription site, and I am capturing the username in a custom variable. I would like to exclude traffic from certain usernames in my reporting (these users are internal traffic and are skewing the data). Unfortunately, I can’t exclude them on the basis of IP address, so I have to use the custom variable “username” to exclude this traffic. How can I do it? I have tried a few things, and thought advanced filters may be a way to go.

Thanks!
Anu

Robbin Steif says:

You can exclude with an Advanced Segment and the custom variable. The other way is more problematic: Custom Variables still aren’t in the back end, so if you created a filter and profile, you have to use an old-fashioned User Defined Variable.

Rob says:

Hi Robbin,
I’m trying to extract a query string parameter (A or Aff_ID) from the landing URL that will be pushed into the campaign source value. I’ve set up the following rule but nothing is happening. The RegEx is definitely correct. Has it something to do with the ordering of the custom filters?

Advanced Filter 1:
A = Request URI =
(.*\?|.*&)[Aa]([Ff][Ff]_[Ii][Dd])?=([^&]*)
Constructor = Campaign Source = $A3

Advanced Filter 2:
This extracts the subdomain from the hostname and RequestURI and pushes them to Custom Field 1.
A = Hostname = ^(subd1|subd2)\.
B = RequestURI = (.*?)(\.x?html)?
Con = Custom Field 1 = $A1$B1

Advanced Filter 3:
This extracts two URL query parameters from the Request URI and pushes them to Custom Field 2.
A = Hostname = ^(subd1|subd2)\.
B = RequestURI = (.*?)(\.x?html)?
Con = Custom Field 2 = $A1$B1

Advanced Filter 4:
This combines the two filters above and overwrites the Request URI.
A = Custom Field 1 = (.*)
B = Custom Field 2 = (.*)
Con = RequestURI = $A1$B1

Advanced filters 2, 3 and 4 are all working fine. None of the campaign IDs are appearing in the reports though.

Any help understanding why this isn’t being captured would be greatly appreciated.

Thanks,

Rob

Frank says:

Hi Robbin, love your blog, helped me a lot already. I know this is an old post, but still relevant as filters in some cases stay a mystery :) I’m encountering something similar to Rob.

I set up filters in various profiles and accounts where campaign data is manipulated due to landing page URL, referrer, source or similar. But Google Analytics seems to have problems with attributing e-commerce data (and other conversion data?) correctly, and there seem to be inconsistencies between what standard reports show vs. what reports with segments applied show, too (might be hard to reconstruct without seeing the reports).

Example: If a landing page contains a certain parameter I overwrite all campaign values (source, medium, campaign, content, term). I see traffic for this newly set source but no e-commerce conversions. But if I apply segments for a) visits that had this parameter in the landing page, and b) for the newly set source they both show data (not exactly the same, but close).

Is this a known issue? What could cause this?

Maybe you have an idea. Thanks! :)