
Archive for the ‘Google Analytics’ Category

Logging Raw Google Analytics Data using Keen IO


Google Analytics export to BigQuery is great for getting at the raw session-level data of Google Analytics. But it’s only for GA Premium (GAP) subscribers. If you have other reasons to need GAP (like increased sampling limits, DoubleClick integration, or additional custom dimensions) and you have the money to spend, GAP is a great option.

Raw GA data?

But what if you’re not a GAP subscriber? Can you still get the raw, session-level data?

In a word: no (at least not from GA). All of the data in GA reports and in its associated reporting APIs is aggregated data. You can create and export reports full of dimensions and metrics, but there’s no report that can give you all of the information for each session the way BigQuery can.

We can do this

Fortunately, there’s an alternative: send the raw data into a repository where it’s easily accessible. Especially if you’re using Google Tag Manager, it’s pretty easy to fire an additional tag at the same time using the same rules as Google Analytics.

There’s a third-party tool that’s perfect for this kind of logging: Keen IO. Keen IO is intended for gathering unstructured event data from any source you like: websites, games, devices. I say “unstructured” because you are free to send any kind of data you like; there are no requirements except a timestamp and whatever set of properties you’d like to record.

Keen IO is a subscription software service. You don’t have to worry about any of the details of how the data is stored or where, and you pay based on the number of events you send per month (up to 50,000/month is free, so it’s easy to try out with no commitment, and it scales at reasonable prices from there). You can send data using a really simple REST API and JSON for the data, or there are ready-to-go SDKs in a variety of languages (including JavaScript, which is good for us in collecting web data).

There are also simple APIs for querying or exporting the data for analysis. Keen’s query APIs are fairly limited (compared to BigQuery, for example), but for moderate volumes of data (like the number of hits within the non-Premium GA limit of 10 million/month), they’re absolutely fine. Keen doesn’t have any built-in reports, so you’ll be pulling queries or extracting data to use with another tool for analysis or visualization (just like BigQuery).

(Would I recommend that you only use Keen IO and drop GA altogether? Definitely not. They’re both great at what they do. Keen IO fills a gap in GA, but it doesn’t recreate all the functionality of GA.)

How it works

  1. Keen IO JavaScript library and configuration. First, you’ll have to sign up for a (free) Keen IO account. Then we can use Keen IO’s JavaScript library by including the following script in your pages:
    <script type="text/javascript">
      !function(a,b){if(void 0===b[a]){b["_"+a]={},b[a]=function(c){b["_"+a].clients=b["_"+a].clients||{},b["_"+a].clients[c.projectId]=this,this._config=c},b[a].ready=function(c){b["_"+a].ready=b["_"+a].ready||[],b["_"+a].ready.push(c)};for(var c=["addEvent","setGlobalProperties","trackExternalLink","on"],d=0;d<c.length;d++){var e=c[d],f=function(a){return function(){return this["_"+a]=this["_"+a]||[],this["_"+a].push(arguments),this}};b[a].prototype[e]=f(e)}var g=document.createElement("script");g.type="text/javascript",g.async=!0,g.src="https://d26b395fwzu5fz.cloudfront.net/3.0.7/keen.min.js";var h=document.getElementsByTagName("script")[0];h.parentNode.insertBefore(g,h)}}("Keen",this);
      var keenTracker = new Keen({
        projectId: "your_project_id",
        writeKey: "your_write_key"
      });
    </script>

    This just loads the library and sets up a tracker object (here I’ve called it keenTracker) with the project ID and write key that Keen supplies you when you sign up.

  2. Tracking pageviews, events, and any other interactions. To log something to Keen, we can use the following script:
    keenTracker.addEvent("www.example.com", keenData);

    Of course, we’re going to have to fill in that data bit (it takes a JSON object) but we’ll get to that in a second. If you’re using GTM (which will be by far the easiest way to do this, rather than hard-coding in your pages), you’ll probably want to create one or more tags with the code from step 1 plus the code from step 2, that then take various macros and information from your dataLayer to fill in the data (which we’ll get to next).

  3. Leveraging Google Analytics & Tag Manager data. So how are we filling in all of this data? First off, let’s leverage all the things GA or GTM can tell us. GA has a client ID (a unique ID for the device which is used to count users), as well as a bunch of information about the size of the screen and other details we might be interested in. You can access these properties by using the get command in GA, like this:
            "clientId": tracker.get("clientId")

    Here I simply pushed the value to the dataLayer, since it will be easy to grab in GTM. Besides clientId, there’s a slew of system details you may want to grab like screen size and so on – the field reference for GA gives all the names of these properties. Note that these values won’t be available until after GA does its thing, so you want to use tag priority to manage their order.

    Besides the stuff GA knows, GTM can also get the value of items such as the URL of the page, the referrer, the values of cookies, and more. Use whatever you need!

    Then we’ll likely want to fill all this in for the Keen data using some macros (from the dataLayer or anywhere else you like). In your tag, before the addEvent command, let’s set up the data:

    var keenData = {
        'clientId': {{Client Id macro}},
        // ...and so on...
    };

    You may need a couple of different kinds of tags (one to match the types of data in GA pageviews, another in GA events). You’ll want to set them up on the same rules as your GA tags in GTM.

    (Keen even gives a full recipe for capturing pageviews, although we’re short-circuiting some of the steps by leveraging information that already exists in GA & GTM.)

  4. Enrich data with Keen’s add-ons. Keen also has some add-ons that help enrich the data. For example, it can capture the user agent (browser type and version) and parse that into its various pieces. The most important of these is for geo-location by IP address. If you capture the IP address in the data you send, you can have Keen add a city, state, country, and postal code. That’s something that GA also does, but it happens after the data is sent, so we’ll want to replicate this in our Keen data. More documentation on Keen’s add-ons here.
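Pulling steps 2 through 4 together, a tag might assemble the event roughly like this. Treat it as a minimal sketch rather than Keen’s official recipe: `buildKeenEvent` and the property names `ip_address`, `user_agent`, `ip_geo_info`, and `parsed_user_agent` are placeholders I made up, and the add-on names (`keen:ip_to_geo`, `keen:ua_parser`) and the `${keen.ip}` / `${keen.user_agent}` placeholders are as I understand them from Keen’s add-on documentation, so check them against the current docs before relying on them.

```javascript
// Hypothetical helper: merges GA/GTM values into one Keen event payload.
// In a GTM Custom HTML tag, the arguments would come from macros or the dataLayer,
// e.g. after: ga(function (tracker) { dataLayer.push({ clientId: tracker.get("clientId") }); });
function buildKeenEvent(gaFields, pageInfo) {
  return {
    clientId: gaFields.clientId,
    screenResolution: gaFields.screenResolution,
    url: pageInfo.url,
    referrer: pageInfo.referrer,
    // Assumed Keen placeholders, resolved by Keen when the event arrives.
    ip_address: "${keen.ip}",
    user_agent: "${keen.user_agent}",
    keen: {
      // Assumed add-on configuration: each add-on reads one property and writes another.
      addons: [
        { name: "keen:ip_to_geo", input: { ip: "ip_address" }, output: "ip_geo_info" },
        { name: "keen:ua_parser", input: { ua_string: "user_agent" }, output: "parsed_user_agent" }
      ]
    }
  };
}

// Example values only; real ones would be supplied by GA and GTM.
var keenData = buildKeenEvent(
  { clientId: "1234567890.0987654321", screenResolution: "1280x800" },
  { url: "http://www.example.com/", referrer: "" }
);
// In the browser, the tag would then call:
// keenTracker.addEvent("www.example.com", keenData);
```

Keeping the payload construction in one function makes it easy to share between a pageview tag and an event tag, with each tag passing in its own extra properties.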

Reading the data

Once the data is in Keen IO, it’s just a big list of all the properties you sent. You can query it using Keen IO’s read APIs, which let you do a variety of types of queries from simple counting and summing, to filtering and funnel analysis.

This process is a little different from BigQuery, in that Keen uses a simple REST API with parameters to define the query rather than a SQL-like query syntax. Many of the most common tasks can be accomplished through this API, but note that you can also use the API to extract an entire data set to another tool (BigQuery, even, or to an analysis or visualization tool like Tableau or R).
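To give a feel for what “a REST API with parameters” means in practice, here is a small sketch that builds a query URL. The helper name `buildKeenQueryUrl` is mine, the project ID and read key are placeholders, and the URL layout and parameter names (`event_collection`, `timeframe`, `group_by`, `filters`) reflect my understanding of Keen’s query API, so verify them against Keen’s documentation.

```javascript
// Hypothetical helper that builds a Keen IO query URL from a set of parameters.
function buildKeenQueryUrl(projectId, readKey, analysisType, params) {
  var base = "https://api.keen.io/3.0/projects/" + encodeURIComponent(projectId) +
    "/queries/" + encodeURIComponent(analysisType);
  var pairs = ["api_key=" + encodeURIComponent(readKey)];
  Object.keys(params).forEach(function (key) {
    var value = params[key];
    // Structured values such as filters are sent as JSON strings.
    if (typeof value !== "string") {
      value = JSON.stringify(value);
    }
    pairs.push(encodeURIComponent(key) + "=" + encodeURIComponent(value));
  });
  return base + "?" + pairs.join("&");
}

// Count events from the last seven days, broken down by page URL.
var url = buildKeenQueryUrl("your_project_id", "your_read_key", "count", {
  event_collection: "www.example.com",
  timeframe: "this_7_days",
  group_by: "url"
});
```

Requesting that URL (from a server or an analysis script, so the read key stays private) should return the counts as JSON.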


There’s lots of additional logic you could use to enhance this data and make it even better:

  • Implement a cookie to sessionize events on the client side.
  • Process GA campaign tags contained in URL query parameters into the data.
  • Leverage GA’s new tasks feature to improve your Keen IO data collection. For example, you could abort the event if the page is rendering in a “Top Sites” preview in Safari, or check whether cookies are enabled.
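The first two ideas can be sketched in a few lines. Both helpers below are hypothetical names of my own; the session logic assumes a 30-minute inactivity timeout (the same as GA’s default), and in a real tag you would read and write the previous session object from a cookie rather than pass it in.

```javascript
// Hypothetical helper: extracts GA campaign tags (utm_*) from a URL query string.
function parseCampaignTags(queryString) {
  var tags = {};
  var wanted = ["utm_source", "utm_medium", "utm_campaign", "utm_term", "utm_content"];
  queryString.replace(/^\?/, "").split("&").forEach(function (pair) {
    if (!pair) { return; }
    var idx = pair.indexOf("=");
    var key = decodeURIComponent(idx === -1 ? pair : pair.slice(0, idx));
    var value = idx === -1 ? "" : pair.slice(idx + 1);
    if (wanted.indexOf(key) !== -1) {
      // Treat "+" as a space before decoding, as form-encoded URLs do.
      tags[key] = decodeURIComponent(value.replace(/\+/g, " "));
    }
  });
  return tags;
}

// Hypothetical sessionizer: keep the session ID unless the last hit is too old.
var SESSION_TIMEOUT_MS = 30 * 60 * 1000; // assumed timeout, matching GA's default
function nextSession(previous, nowMs, newId) {
  var expired = !previous || nowMs - previous.lastHit > SESSION_TIMEOUT_MS;
  return { id: expired ? newId : previous.id, lastHit: nowMs };
}
```

The results of both functions could simply be added as extra properties on the event before calling addEvent.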

Other tools

Snowplow Analytics is similar to Keen IO, and is available both as software-as-a-service, as well as open source software for running your own data warehouse on top of technologies like Amazon S3 and Redshift for storage. It’s a solid solution, and in some ways even better suited than Keen IO to this problem, albeit a bit more complicated to set up since it relies on underlying Amazon Web Services. Definitely worth taking a look at if you are seriously considering a solution like this one.

BigQuery itself recently started supporting streaming data import (rather than batch jobs), but given the way authorization works in BigQuery, it’s not really appropriate for client-side tracking. If you were sending data from the server-side, it would be a possibility.

Whatever tool you choose, it should be obvious that the power of Google Tag Manager and one of these data logging tools in combination can empower you to collect raw interaction data for research and analysis.

Kick Off 2015 with a LunaMetrics Training


As we look towards the end of this year and the beginning of 2015, consider how a training in Google Analytics, Google AdWords, or Google Tag Manager may help your career! Choose from seven different cities in the first quarter, ranging from Boston to San Francisco, with stops in Chicago and Denver along the way.

With trainings in cities around the country, we hope you can find a location that is easy to travel to and fun to explore!

Whether you’re just starting out in a new field or looking to get a deeper understanding of the tools you’re currently using, we have a class for you.

Learn how to better collect and analyze your data with our Google Analytics series, futureproof your website with the flexible Google Tag Manager, or drive qualified traffic to your site through paid search with our Google AdWords trainings.

Choose an option below to learn more about the specific topics we cover and decide which trainings would be right for you!

Google Analytics Google AdWords Google Tag Manager


Access 404 Error Metrics Using Google Tag Manager


As analysts and marketers, we always want to track positive performance metrics and conversions in Google Analytics. However, tracking errors is also important to monitor the health of your site and keep track of signals indicating a negative user experience.

Accessing this data gives us a better idea of what’s causing users to get lost and wander into the dark, unattached voids of your domain. Knowing where these problem spots are makes it easier to fix internal links or set redirects.

I’ll show you different ways to view where people are hitting these error pages and where they are coming from, either through your existing setup or by using Google Tag Manager to fire events or virtual pageviews. (more…)

Understanding Bot and Spider Filtering from Google Analytics


On July 30th, 2014, Google Analytics announced a new feature to automatically exclude bots and spiders from your data. In the view level of the admin area, you now have the option to check a box labeled “Exclude traffic from known bots and spiders”.

Most of the posts I’ve read on the topic are simply mirroring the announcement, and not really talking about why you want to check the box. Maybe a more interesting question would be why you would NOT want to. Still, most people will ultimately want to check this box. I’ll tell you why, but also how to test it beforehand. (more…)

Easy Cohort Analysis for Blogs and Articles


It’s now easier than ever to track and compare performance between articles and blogs. While Google Analytics shows you pageviews and other key metrics, frequent content comparisons are made difficult by the shifting time frames.

How can I compare a blog post that was published this month vs. a blog post that was posted last month? Sure, we can run two different reports, pull it into Excel and start crunching the numbers, but there’s gotta be a better way!

Enter Cohort Analysis. You may have heard this term thrown around before, usually in relation to users on your site and when they first became users. The idea here is to group users or sessions into common groups, like who first visited in January or first-month visitors. Avinash and Justin Cutroni both love cohorts, so obviously we should, too!

In this case, we’re going to use Google Tag Manager to put content into cohorts so we can analyze how they performed in similar time frames. We’ll pass these into Google Analytics as Custom Dimensions so they’re available for analysis. It’s actually much easier than it sounds! (more…)
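As a rough illustration of the idea, here is one way cohort values could be derived from a post’s publish date before being passed along as Custom Dimensions. This is only a sketch under my own assumptions: the helper `contentCohort` and its output names are hypothetical, and the approach in the full post may well differ.

```javascript
// Hypothetical helper: derives cohort values from a post's publish date.
function contentCohort(publishDate, viewDate) {
  var MS_PER_DAY = 24 * 60 * 60 * 1000;
  var month = publishDate.getUTCMonth() + 1;
  var daysSince = Math.floor((viewDate.getTime() - publishDate.getTime()) / MS_PER_DAY);
  return {
    // Publish month in YYYY-MM form, so posts group by the month they went live.
    publishCohort: publishDate.getUTCFullYear() + "-" + (month < 10 ? "0" : "") + month,
    // Age of the post at the time of the pageview.
    daysSincePublish: daysSince,
    weeksSincePublish: Math.floor(daysSince / 7)
  };
}
```

With values like these on every pageview, a post from last month and a post from this month can be compared over their first week or first 30 days instead of over the same calendar dates.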

Segmenting Google Analytics by Session Frequency


Segments are one of the most powerful features of Google Analytics, and they are often useful for zeroing in on the sets of users who are most valuable to us.

One way of looking at potentially valuable users is to look at the frequency with which they visit the website. Let’s look at a couple of ways to do that in GA.


Data Processing Options for Google Analytics and Big Query Export


In this blog post, I evaluate several of the numerous (and potentially overwhelming) options for the processing and reporting of Google Analytics data. The default Google Analytics web interface is great for quick ad hoc data exploration, but limited for deeper analysis and the development of automated reports.

Whether we’re mining for hidden trends or trying to report on hard-to-extract dimensions, there are a number of third-party tools out there that can help ease the burden.

In the first half of this article, I explain the difference between the two types of Google Analytics data: what’s available from the standard interface and what’s available through the BigQuery export.

The second half of this article is an evaluation of three different solutions for processing, visualizing, and reporting on Google Analytics/GA BigQuery data. I evaluate these three solutions (ShufflePoint, Tableau, and R) based on objective features and my subjective scoring of performance.

I only evaluate three data processing solutions in this article. Think I missed a good one? Let me know!! We all have a different background in data analysis tools, and I would love this conversation to continue in the comments section.

Dynamic Data Viz: A Better Way to Plot Rows in Google Analytics

Have you ever tried to use the “plot rows” feature in Google Analytics, only to have it literally fall flat?

It happens because you can’t keep the chart from graphing the metric total. That thick blue line across the top of your chart flattens everything else. It keeps the size of the chart static, rendering it useless.

Wouldn’t it be great if you could graph only the rows you want and the chart would dynamically resize?

Here’s the key to turning those flat, plotted rows into dynamic data visualizations: motion charts. (more…)

How a Local Business in Seattle Can Gain Street Level Insight in Google Analytics


Did you ever want micro-level geographic information inside Google Analytics? What if you really need “street level” knowledge about your users, like where they are and what neighborhood they’re in? Often, when we talk and write about Google Analytics, we’re thinking about the big guys: national or even international traffic, filtering by country, comparing one region to another. We’re thinking macro, not micro.

I wrote previously comparing DMA areas to gain insight, but that’s really only helpful if you have a true national or bigger presence. What if you’re just a local Seattle business, and don’t really have much call for looking at traffic outside the Seattle-Tacoma metro area?

Well, first thing you should do is think about taking our Seattle Google Analytics, AdWords, and Tag Manager Training (shameless plug). Second, read on…

Seattle is actually ahead of the game when it comes to data, which is the real reason I’m using it as an example. The city has a Chief Technology Officer, and data.seattle.gov was started in 2010 as a central hub for all local Seattle data. In fact, a number of businesses claimed that the use of this local data helped them with their businesses.

How so? Well, if you’re a local business then the traffic from, and information about, the Queen Anne neighborhood of Seattle might be more important to you than Downtown or Riverview.

But how can you use Google Analytics to help you on this sort of granular level? Also what if you DO care about national level data, but you care about it on a very granular local level as well, maybe looking for interest in your brand to help place billboards, or expand your franchising? The truth is that you can’t, at least not right out of the box. But with a few very easy additions, you can start getting some great local data that can let you make street level decisions about your business in Google Analytics. (more…)

Duplicate Transactions in Google Analytics – The Check and the Fix


Duplicate transactions are by far the most common issue I’ve come across with ecommerce sites. They can inflate revenue and ecommerce metrics, alter your attribution reports, and make you question your data integrity.

When talking about where to put the ecommerce tracking code, Google suggests the following for Universal Analytics:

“… If successful, the server redirects the user to a “Thank You” or receipt page with transaction details and a receipt of the purchase. You can use the analytics.js library to send the ecommerce data from the “Thank You” page to Google Analytics.”

The missing step here is to ensure that either A) the user cannot access the page more than once or B) you have logic in place to make sure the transaction is only sent once. The biggest issues I’ve seen are when this receipt page is automatically emailed to the customer, with the ability for them to return as frequently as they please, each time sending a duplicate transaction.
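Option B can be as simple as remembering which transaction IDs have already been reported. The sketch below is one hypothetical way to do that on the client: `shouldSendTransaction`, `createMemoryStorage`, and the `sentTransactions` storage key are names I made up, and because the check lives in the browser it only catches repeats from the same browser. A server-side “already sent” flag on the order would be more robust.

```javascript
// Hypothetical guard: report each transaction ID at most once per browser.
// `storage` is anything with getItem/setItem, such as window.localStorage.
function shouldSendTransaction(transactionId, storage) {
  var key = "sentTransactions"; // assumed storage key
  var sent = JSON.parse(storage.getItem(key) || "[]");
  if (sent.indexOf(transactionId) !== -1) {
    return false; // already reported, so skip the duplicate
  }
  sent.push(transactionId);
  storage.setItem(key, JSON.stringify(sent));
  return true;
}

// In-memory stand-in for localStorage so the sketch also runs outside a browser.
function createMemoryStorage() {
  var data = {};
  return {
    getItem: function (k) {
      return Object.prototype.hasOwnProperty.call(data, k) ? data[k] : null;
    },
    setItem: function (k, v) { data[k] = String(v); }
  };
}
```

On the receipt page, you would wrap the ecommerce calls in a check like `if (shouldSendTransaction(orderId, window.localStorage)) { ... }`, where `orderId` is whatever transaction ID your site already passes to Google Analytics.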