GA: Why do pages refer to themselves?

Content - Navigation

About a week ago, I read a post by Avinash that answered GA questions, but when I got to the part about the navigation report (see screen shot, left), I just didn’t agree. The question was, “Navigation summary question – why is previous and next page often the same as the page you are viewing?” Like this report on the left: notice that 6.23% of the pageviews that lead to the index page come from the index page, and 6.23% of the pageviews that follow the index page go back to the index page. A little strange, no?

Why I was suspicious of the original answer.

In his post, Avinash wrote that someone at GA explained what caused this peculiar behavior. Here is how he described it: basically, it is about visitors who look at a regular tagged page and then look at a picture on the page in larger format (which isn’t tagged). Here is the example he gives:

Visitor Action One (view): /avinash/2007/09/rethink-web-analytics-introducing-web-analytics-20.html
Result: javascript hit generated (data collected)

Visitor Action Two (click): http://www.kaushik.net/avinash/wp-content/uploads/2007/09/web_analytics_1.0.png
Result: NO javascript hit generated (no data collected)

Visitor Action Three (back): /avinash/2007/09/rethink-web-analytics-introducing-web-analytics-20.html
Result: javascript hit generated

Visitor Action Four (click): http://www.kaushik.net/avinash/wp-content/uploads/2007/09/web_analytics_2.0.png
Result: NO javascript hit generated

Visitor Action Five (back): /avinash/2007/09/rethink-web-analytics-introducing-web-analytics-20.html
Result: javascript hit generated

To Google Analytics (or any other Analytics tool), it will look like this:

1) /avinash/2007/09/rethink-web-analytics-introducing-web-analytics-20.html - javascript hit generated

2) /avinash/2007/09/rethink-web-analytics-introducing-web-analytics-20.html - javascript hit generated

3) /avinash/2007/09/rethink-web-analytics-introducing-web-analytics-20.html - javascript hit generated

</Avinash>
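Here is a minimal Python sketch of the mechanics in that example. It is not from Avinash's post or from Google; the `tagged` flag and the variable names are assumptions made up for illustration. It only shows that when the untagged image views drop out, the tool is left with the same page three times in a row.

```python
# URLs are copied from the example above; everything else is illustrative.
POST = "/avinash/2007/09/rethink-web-analytics-introducing-web-analytics-20.html"
IMG_1 = "http://www.kaushik.net/avinash/wp-content/uploads/2007/09/web_analytics_1.0.png"
IMG_2 = "http://www.kaushik.net/avinash/wp-content/uploads/2007/09/web_analytics_2.0.png"

# Each visitor action is a (url, tagged) pair. `tagged` is a hypothetical flag
# meaning "this URL carries the tracking javascript".
actions = [
    (POST, True),    # action one: view the post
    (IMG_1, False),  # action two: click the first picture (no hit)
    (POST, True),    # action three: back to the post
    (IMG_2, False),  # action four: click the second picture (no hit)
    (POST, True),    # action five: back to the post
]

# Only tagged views generate a hit, so only they reach the analytics tool.
recorded_hits = [url for url, tagged in actions if tagged]

for n, url in enumerate(recorded_hits, start=1):
    print(f"{n}) {url}")
```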

This sounded plausible, but too neat. Much too neat for me. What if someone got to one of those pictures – one of those untagged .png pages – and decided to leave the site altogether? If just a single person bailed out, that would make the percentages different. In order for this explanation to work, every single person would have to exhibit the identical behavior – they would all have to look at two pictures and come back to the same page. It has to be perfectly symmetrical, and it is in the hands of thousands of humans to do it the same way.

Do you believe that? I didn’t. But I didn’t know the answer.

The Truth According to John (aka Google Analytics Gang Signing)

So yesterday, I was working with John and Jonathan here at LunaMetrics. “Did you see Avinash’s post a week ago?” I asked them, “Those numbers are WAY too clean. How could a page refer to itself and then refer to itself again every single time?”

John thought to himself for a couple of minutes and then said, “Oh, I get it. Here is what happens. Whenever the page is viewed twice in a row – like a page reload — the whole thing automatically works.” He put his hands together in the configuration on the left. Jonathan nodded wisely. I looked at them like they were nuts.

But ultimately, I understood what he meant:

If a page precedes itself, it also follows itself. That’s what John meant with his fingers — on one side of the report, we see a page preceding itself, on the other side of the report, we see the page following itself. It is just the same story, told twice.

The key is, you can’t think of that report like a clickstream when it involves the same page more than once. Once you stop thinking about it that way, it becomes intelligible. The page is the same no matter which of the columns of the report it appears in, and the numbers have to match exactly because of that.
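For readers who prefer to see it counted out, here is a small Python sketch of the idea. It is an illustration under assumptions, not how GA actually computes the report: the `navigation_summary` function, the sample sessions, and the page names are all hypothetical. Every time a page is followed by itself, that one pair adds one to the "previous" side and one to the "next" side, so the two self-referral counts cannot differ, even when visitors leave at different points.

```python
from collections import Counter


def navigation_summary(sessions, page):
    """Count which pages came immediately before and immediately after `page`."""
    previous, following = Counter(), Counter()
    for hits in sessions:
        # Walk each pair of consecutive recorded pageviews in the session.
        for before, after in zip(hits, hits[1:]):
            if after == page:
                previous[before] += 1
            if before == page:
                following[after] += 1
    return previous, following


# Hypothetical sessions; visitors do NOT all behave the same way.
sessions = [
    ["/index", "/index", "/about"],    # one reload, then moves on
    ["/index", "/index"],              # one reload, then leaves the site
    ["/about", "/index", "/contact"],  # no repeat at all
    ["/index", "/index", "/index"],    # two reloads in a row
]

previous, following = navigation_summary(sessions, "/index")
print("index -> index on the previous side:", previous["/index"])
print("index -> index on the next side:    ", following["/index"])
```

The other rows of the two counters are free to differ; only the row for the page itself is forced to match.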

Still lost? I know that some of you are sitting there nodding your heads, while others are saying, “What is she talking about?” So for the latter crowd, let me describe it in a different way. I hope you won’t mind if I use numbers instead of percentages, just to make this clearer.

Let’s say that Page A refers to itself via a page reload 100 times. And let’s say that the website has only one page — Page A. The report would look like this — in a conceptual way:

Notice how we get 200 pageviews in the middle of the page (and we know that’s how many there are: 100 original views plus 100 reloads). Notice how the number of pageviews on the left and on the right are symmetrical. And notice how these are two identical pictures, which meet in the middle, just like the picture of John’s hands above.
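The same one-page example can be tallied in a few lines of Python. This is only a sketch under assumptions: it supposes 100 visits that each view Page A once and reload it once, and the "(entrance)" and "(exit)" labels are stand-ins for where each visit starts and ends, not necessarily the labels GA shows.

```python
from collections import Counter

# Assumption: 100 visits, each one a view of Page A followed by one reload.
sessions = [["A", "A"] for _ in range(100)]

pageviews = sum(len(hits) for hits in sessions)

previous, following = Counter(), Counter()
for hits in sessions:
    # Pad each visit so the first and last pageviews also have a neighbor.
    padded = ["(entrance)"] + hits + ["(exit)"]
    for before, current, after in zip(padded, padded[1:], padded[2:]):
        if current == "A":
            previous[before] += 1
            following[after] += 1

print("pageviews of A:", pageviews)
print("previous:", dict(previous))
print("next:    ", dict(following))
```

Under these assumptions, Page A accounts for 100 of the 200 entries on each side, which is the symmetry described above.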

So I think I have run out of ways to explain this problem. It is sometimes caused by a reload, and sometimes caused by part of the explanation that Avinash gave. But it never requires thousands of people to exhibit the identical behavior.

And in closing, John wanted me to show off that he is really known for his good looks and not for his gang signs, so here he is.

Robbin

Robbin Steif

About Robbin Steif

Our owner and CEO, Robbin Steif, started LunaMetrics ten years ago. She is a graduate of Harvard College and the Harvard Business School, and has served on the Board of Directors for the Digital Analytics Association. Robbin is a recent winner of a BusinessWomen First award, as well as a Diamond Award for business leadership.

http://www.lunametrics.com/blog/2008/07/29/ga-pages-refer/

19 Responses to “GA: Why do pages refer to themselves?”

Steve says:

Nice pic John! But I have to ask, what does the gang sign mean? ;-)

Cheers!
Steve
PS Robbin, yeah the rest of the article was fine. :-P

Aysberg says:

Either I don’t get it or you guys think too complicated. I always figured those numbers being page reloads. In this example: 6.23% of the users reloaded the page, the rest clicked somewhere or left. But these numbers make me wonder: Why do visitors reload pages (static HTML pages)? On some customer sites the number is a lot higher than on others.

Rachel says:

This was so helpful. I have been wondering about this for ages. Duh. Makes sooo much sense now! I heart you guys.

pratt says:

John always brings the sexy!

Thanks for including the “still lost?” section, I was one of those people, but now I get it!

Also, the link you have to Avinash goes to New York City Conference Centers, is that intentional?

Robbin Steif says:

Darn, I filled out this whole response to you, Aysberg, and lost it all by forgetting to do the spam protection field. Worst practices in conversion.

It is hard to understand. I really gave it my best shot. Maybe someone else will try. I will nag some of my friends and see if they are willing to give it a shot…. Robbin

Robbin Steif says:

Ach, how did I screw up that link? (No wonder Avinash didn’t write me this morning. I was going to write him and say, “well???”) Will fix, many thanks. (Of course, I will send John the comment about how he brings the sexy…)

cw360 says:

You might also call this an echo.

In light of this, my view is that a page shouldn’t show up in its own previous page report or next page report because it doesn’t make sense to include page refreshes in the percentage calculations for previous pages. Or, perhaps have a checkbox that allows the user to decide whether or not to include page refreshes. Although I’m not sure of the value of including page refreshes in the previous/next page distributions.

Robbin says:

Dustin — I just used the refresh as an example that is easy to understand. It could be lots of things — like half of Avinash’s example, that would do the same job.

cebuimage says:

I am a fan of Avi. Are you sure about this?

Robbin says:

It’s not a matter of being a fan. He is great. But I knew that what he wrote wouldn’t work; you would have to get EVERYONE to do it the same way. So if you don’t believe me, you should go do a test. Here is how: Create a new profile for your website. In Firefox, type javascript:pageTracker._setVar('test');

Then create an include filter for that profile that only allows users with the user-defined variable test. Then for one page, do it the way I described, e.g. look at ONE picture that isn’t coded, or do ONE reload. And the next day, do it again, and again, because web analytics tools aren’t perfect and if you just do it once, it might not pick up your activity (but it probably will).

And then go look at your web analytics. And report back here!! (If you can.)

Markus says:

Another possible explanation for the “reloads” would be a link to another website. From page A you click on a link to http://www.foo.com. After some time you push the back button and come back to page A.

Brian Katz says:

3 Important points in descending importance order:

1) Is the pic of John just there to test Avinash’s theory? :)

2) I think the misleading part of understanding Navigation is to say that pages “refer” to their “Next” pages.
There may, in fact, be no referring between consecutive pages.
E.g.: from the Home page, a visitor opens, say, a Privacy Policy page in a 2nd tab/window, then changes back to the tab with the Home page and clicks the News link. GA will show the flow as being from Home -> Privacy -> News, but Privacy has no links other than to the Home page.
To really appreciate the point, think of one virtual page name “referring” to a subsequent virtual page name!!!

3) There are 2 situations – 1 where the % for a page is identical in both lists and the other when it isn’t
It follows from Robbin’s impeccable reasoning (with which I entirely and respectfully concur), different %’s *must* have a different cause. It may point to a page not being tagged, a page that takes too long to load, etc

Brian

Kirill says:

I use SWFAddress for Flash and GA (pagetracker setpage function) for site navigation like this:

mysite.com#/page1 -> mysite.com#/page2 -> mysite.com#/page3 -> …

and percentage of “reloads” is very high (~ 20%).

I can’t explain this number of reloads… Why could it be?

it’s really helpful, thank u! love the photo of John and your cool blog!

MrConners says:

I have a problem with the Navigation Summary showing me what I believe to be impossible. My main landing page has randomized selection code that goes to either indexa or indexb. When viewing the navigation summary for either index page, the opposite index page shows up in the previous and next page sections. There are no links that would allow this to happen, and refreshing the page will only reload the index page you are in.

The Exit % is also reading 0%, which I see in another Google post as a current emerging problem.

My numbers read as thus:
For indexa
/indexb previous % clicks: 12.78
/indexb next % clicks: 18.45

For indexb
/indexa previous % clicks: 12.54
/indexb next % clicks: 22.66