Logging Raw Google Analytics Data using Keen IO

Google Analytics export to BigQuery is great for getting at the raw session-level data of Google Analytics, but it’s only available to GA Premium (GAP) subscribers. If you have other reasons to need GAP (like increased sampling limits, DoubleClick integration, or additional custom dimensions) and you have the money to spend, GAP is a great option.

Raw GA data?

But what if you’re not a GAP subscriber? Can you still get the raw, session-level data?

In a word: no (at least not from GA). All of the data in GA reports and in its associated reporting APIs is aggregated data. You can create and export reports full of dimensions and metrics, but there’s no report that can give you all of the information for each session the way BigQuery can.

We can do this

Fortunately, there’s an alternative: send the raw data into a repository where it’s easily accessible. Especially if you’re using Google Tag Manager, it’s pretty easy to fire an additional tag at the same time using the same rules as Google Analytics.

There’s a third-party tool that’s perfect for this kind of logging: Keen IO. Keen IO is intended for gathering unstructured event data from any source you like: websites, games, devices. I say “unstructured” because you are free to send any kind of data you like; there are no requirements except a timestamp and whatever set of properties you’d like to record.

Keen IO is a subscription software service. You don’t have to worry about any of the details of how the data is stored or where, and you pay based on the number of events you send per month (up to 50,000/month is free, so it’s easy to try out with no commitment, and it scales at reasonable prices from there). You can send data using a really simple REST API and JSON for the data, or there are ready-to-go SDKs in a variety of languages (including JavaScript, which is good for us in collecting web data).

There are also simple APIs for querying or exporting the data for analysis. Keen’s query APIs are fairly limited (compared to BigQuery, for example), but for moderate volumes of data (like the number of hits within the non-Premium GA limit of 10 million/month), they’re absolutely fine. Keen doesn’t have any built-in reports, so you’ll be pulling queries or extracting data to use with another tool for analysis or visualization (just like BigQuery).

(Would I recommend that you only use Keen IO and drop GA altogether? Definitely not. They’re both great at what they do. Keen IO fills a gap in GA, but it doesn’t recreate all the functionality of GA.)

How it works

  1. Keen IO JavaScript library and configuration. First, you’ll have to sign up for a (free) Keen IO account. Then we can use Keen IO’s JavaScript library by including the following script in your pages:
    <script type="text/javascript">
      !function(a,b){if(void 0===b[a]){b["_"+a]={},b[a]=function(c){b["_"+a].clients=b["_"+a].clients||{},b["_"+a].clients[c.projectId]=this,this._config=c},b[a].ready=function(c){b["_"+a].ready=b["_"+a].ready||[],b["_"+a].ready.push(c)};for(var c=["addEvent","setGlobalProperties","trackExternalLink","on"],d=0;d<c.length;d++){var e=c[d],f=function(a){return function(){return this["_"+a]=this["_"+a]||[],this["_"+a].push(arguments),this}};b[a].prototype[e]=f(e)}var g=document.createElement("script");g.type="text/javascript",g.async=!0,g.src="https://d26b395fwzu5fz.cloudfront.net/3.0.7/keen.min.js";var h=document.getElementsByTagName("script")[0];h.parentNode.insertBefore(g,h)}}("Keen",this);
    
      var keenTracker = new Keen({
        projectId: "your_project_id",
        writeKey: "your_write_key"
      });
    </script>
    

    This just loads the library and sets up a tracker object (here I’ve called it keenTracker) with the project ID and write key that Keen supplies you when you sign up.

  2. Tracking pageviews, events, and any other interactions. To log something to Keen, we can use the following script:
    keenTracker.addEvent("www.example.com", keenData);
    

    Of course, we’re going to have to fill in that data bit (it takes a JSON object) but we’ll get to that in a second. If you’re using GTM (which will be by far the easiest way to do this, rather than hard-coding in your pages), you’ll probably want to create one or more tags with the code from step 1 plus the code from step 2, that then take various macros and information from your dataLayer to fill in the data (which we’ll get to next).

  3. Leveraging Google Analytics & Tag Manager data. So how are we filling in all of this data? First off, let’s leverage all the things GA or GTM can tell us. GA has a client ID (a unique ID for the device which is used to count users), as well as a bunch of information about the size of the screen and other details we might be interested in. You can access these properties by using the get command in GA, like this:
    ga(function(tracker) {
        dataLayer.push({
            "clientId": tracker.get("clientId")
        });
    });
    

    Here I simply pushed the value to the dataLayer, since it will be easy to grab in GTM. Besides clientId, there’s a slew of system details you may want to grab like screen size and so on – the field reference for GA gives all the names of these properties. Note that these values won’t be available until after GA does its thing, so you want to use tag priority to manage their order.

    Besides the stuff GA knows, GTM can also get the value of items such as the URL of the page, the referrer, the values of cookies, and more. Use whatever you need!

    Then we’ll likely want to fill all this in for the Keen data using some macros (from the dataLayer or anywhere else you like). In your tag, before the addEvent command, let’s set up the data:

    var keenData = {
        'clientId': {{Client Id macro}},
        // ...and so on...
    };
    

    You may need a couple of different kinds of tags (one to match the types of data in GA pageviews, another in GA events). You’ll want to set them up on the same rules as your GA tags in GTM.

    (Keen even gives a full recipe for capturing pageviews, although we’re short-circuiting some of the steps by leveraging information that already exists in GA & GTM.)

  4. Enrich data with Keen’s add-ons. Keen also has some add-ons that help enrich the data. For example, it can capture the user agent (browser type and version) and parse that into its various pieces. The most important add-on is geo-location by IP address. If you capture the IP address in the data you send, you can have Keen add a city, state, country, and postal code. GA does this too, but only after the data is sent, so we’ll want to replicate it in our Keen data. Keen’s documentation covers the add-ons in more detail.
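To make steps 2 through 4 concrete, here is a minimal sketch of what the event object might look like once it is filled in. The helper name `buildPageviewEvent` and the property names (`url`, `referrer`, `title`, `screenResolution`) are made up for illustration; in a GTM tag you would pass macros instead of literal values. The `keen.addons` block and the `${keen.ip}` and `${keen.user_agent}` placeholders follow the add-on format as I understand it from Keen’s documentation, so treat them as an assumption and check the current docs before relying on them.

```javascript
// Hypothetical helper: assembles one pageview event to send to Keen IO.
// In GTM, the arguments would come from macros and the dataLayer.
function buildPageviewEvent(clientId, page) {
  return {
    clientId: clientId,
    url: page.url,
    referrer: page.referrer || null,
    title: page.title || null,
    screenResolution: page.screenResolution || null,

    // Assumed placeholders: Keen is expected to substitute the
    // requester's IP address and user agent string on its side.
    ip_address: "${keen.ip}",
    user_agent: "${keen.user_agent}",

    // Assumed add-on configuration: each entry names an add-on, the
    // property it reads from, and the property it writes its result to.
    keen: {
      addons: [
        {
          name: "keen:ip_to_geo",
          input: { ip: "ip_address" },
          output: "ip_geo_info"
        },
        {
          name: "keen:ua_parser",
          input: { ua_string: "user_agent" },
          output: "parsed_user_agent"
        }
      ]
    }
  };
}

// Example values only; real values come from GA and GTM.
var keenData = buildPageviewEvent("1234567890.1400000000", {
  url: "http://www.example.com/pricing",
  referrer: "http://www.google.com/",
  title: "Pricing"
});

// In the tag, this object is what gets passed to the tracker:
// keenTracker.addEvent("www.example.com", keenData);
```

Keeping the object construction in one function makes it easy to reuse the same base properties for both the pageview tag and the event tag, adding event-specific fields (category, action, label) on top.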

Reading the data

Once the data is in Keen IO, it’s just a big list of all the properties you sent. You can query it using Keen IO’s read APIs, which let you do a variety of types of queries from simple counting and summing, to filtering and funnel analysis.

This process is a little different from BigQuery, in that Keen uses a simple REST API with parameters to define the query rather than a SQL-like query syntax. Many of the most common tasks can be accomplished through this API, but note that you can also use the API to extract an entire data set to another tool (BigQuery, even, or to an analysis or visualization tool like Tableau or R).
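As a rough illustration of what “a REST API with parameters” means in practice, the sketch below builds the URL for a simple count query. The function name `buildKeenQueryUrl` is hypothetical, and the URL layout (`/3.0/projects/<project id>/queries/<analysis type>` with `api_key`, `event_collection`, `timeframe`, and `group_by` parameters) reflects my understanding of Keen’s query API; confirm the details against Keen’s API reference.

```javascript
// Hypothetical helper: builds a Keen IO query URL from a set of parameters.
function buildKeenQueryUrl(projectId, readKey, analysisType, params) {
  var query = ["api_key=" + encodeURIComponent(readKey)];

  Object.keys(params).forEach(function (key) {
    var value = params[key];
    // Structured values such as filters are sent as JSON strings.
    if (typeof value === "object") {
      value = JSON.stringify(value);
    }
    query.push(encodeURIComponent(key) + "=" + encodeURIComponent(value));
  });

  return "https://api.keen.io/3.0/projects/" + encodeURIComponent(projectId) +
    "/queries/" + encodeURIComponent(analysisType) + "?" + query.join("&");
}

// Count events from the last 7 days, grouped by page URL.
var countUrl = buildKeenQueryUrl("your_project_id", "your_read_key", "count", {
  event_collection: "www.example.com",
  timeframe: "this_7_days",
  group_by: "url"
});

// The same count, restricted to a single client ID via a filter.
var filteredUrl = buildKeenQueryUrl("your_project_id", "your_read_key", "count", {
  event_collection: "www.example.com",
  timeframe: "this_7_days",
  filters: [
    { property_name: "clientId", operator: "eq", property_value: "1234567890.1400000000" }
  ]
});
```

You can request either URL with any HTTP client. Use a read key rather than the write key from the tracking snippet, and keep it out of public pages.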

Enhancements

There’s lots of additional logic you could use to enhance this data and make it even better:

  • Implement a cookie to sessionize events on the client side.
  • Process GA campaign tags contained in URL query parameters into the data.
  • Leverage GA’s new tasks feature to improve your Keen IO data collection. For example, you could abort the event if the page is rendering in a “Top Sites” preview in Safari, or check whether cookies are enabled.
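The first two enhancements can be sketched as small pure functions. Everything here is illustrative: the names `parseCampaignTags`, `nextSession`, and `SESSION_TIMEOUT_MS` are made up, and the 30-minute window is simply chosen to match GA’s default session timeout. Reading and writing the actual cookie is left to your tag.

```javascript
// Extracts GA campaign tags (utm_*) from a query string like "?utm_source=x".
function parseCampaignTags(search) {
  var tags = {};
  search.replace(/^\?/, "").split("&").forEach(function (pair) {
    var parts = pair.split("=");
    var key = decodeURIComponent(parts[0]);
    if (/^utm_(source|medium|campaign|term|content)$/.test(key)) {
      // Rejoin in case the value itself contains "="; "+" means a space.
      var raw = parts.slice(1).join("=").replace(/\+/g, " ");
      tags[key] = decodeURIComponent(raw);
    }
  });
  return tags;
}

// Assumption: 30 minutes of inactivity ends a session, as in GA's default.
var SESSION_TIMEOUT_MS = 30 * 60 * 1000;

// previous: { id, lastSeen } as stored in a cookie, or null for a new visitor.
// now: current time in milliseconds (for example, Date.now()).
function nextSession(previous, now) {
  if (previous && now - previous.lastSeen < SESSION_TIMEOUT_MS) {
    // Still inside the window: keep the session ID, refresh the timestamp.
    return { id: previous.id, lastSeen: now };
  }
  // New visitor or expired session: start a new session ID.
  return { id: now + "." + Math.floor(Math.random() * 1e9), lastSeen: now };
}

// Example usage with fixed values.
var campaign = parseCampaignTags(
  "?utm_source=newsletter&utm_medium=email&utm_campaign=fall+sale&page=2"
);
var first = nextSession(null, 1000000);
var second = nextSession(first, 1000000 + 60 * 1000);
var third = nextSession(second, second.lastSeen + 31 * 60 * 1000);
```

In a tag, you would merge the parsed campaign tags and the session ID into the event object before sending it, and write the returned session object back to the cookie on every hit.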

Other tools

Snowplow Analytics is similar to Keen IO, and is available both as software-as-a-service, as well as open source software for running your own data warehouse on top of technologies like Amazon S3 and Redshift for storage. It’s a solid solution, and in some ways even better suited than Keen IO to this problem, albeit a bit more complicated to set up since it relies on underlying Amazon Web Services. Definitely worth taking a look at if you are seriously considering a solution like this one.

BigQuery itself recently started supporting streaming data import (rather than batch jobs), but given the way authorization works in BigQuery, it’s not really appropriate for client-side tracking. If you were sending data from the server-side, it would be a possibility.

Whatever tool you choose, combining Google Tag Manager with one of these data logging tools lets you collect raw interaction data for research and analysis.

Kick Off 2015 with a LunaMetrics Training

As we look towards the end of this year and the beginning of 2015, consider how a training in Google Analytics, Google AdWords, or Google Tag Manager may help your career! Choose from seven different cities in the first quarter, ranging from Boston to San Francisco, with stops in Chicago and Denver along the way.

With trainings in cities around the country, we hope you can find a location that is easy to travel to and fun to explore!

Whether you’re just starting out in a new field or looking to get a deeper understanding of the tools you’re currently using, we have a class for you.

Learn how to better collect and analyze your data with our Google Analytics series, futureproof your website with the flexible Google Tag Manager, or drive qualified traffic to your site through paid search with our Google AdWords trainings.

Choose an option below to learn more about the specific topics we cover and decide which trainings would be right for you!

Google Analytics Google AdWords Google Tag Manager

Read More…

2014 Free SEO Training for Pittsburgh Students & Non-Profits

In 2013, LunaMetrics hosted its first free SEO training, designed for local students and recent graduates and partnering with local non-profit organizations. The event was so successful for all who attended that LunaMetrics will offer the free training again this year, over the weekend of October 18-19.

The students who were chosen to participate in last year’s program left with knowledge of SEO best practices and experience optimizing a website for search engine traffic. These employable skills and experiences could be added to their professional résumés to help kickstart their professional careers in essentially any field.

LaToya Johnson, then a Carlow University MBA student, participated in the 2013 training. She gave this advice, “…I would encourage other students to take advantage; not only will you gain knowledge, great networking opportunities, and a certificate. You may also discover a passion that you didn’t know that you possessed.” Read More…

The 5 Most Asked PPC Questions at a LunaMetrics AdWords Training

While Google AdWords is a terrific platform for getting your advertising message in front of the right audience, it can take years to master. That’s why we offer our Google AdWords Training courses. The sessions are a terrific opportunity to get your questions answered and learn everything you need to know to maximize your ad spend and generate revenue for your business.

It doesn’t matter if you are brand new to pay-per-click advertising or a seasoned pro, you will learn about strategies and settings to help you maximize your account. Every training is unique as attendees work in different industries and have different business models. We really try to speak to specific examples in attendees’ accounts and industries.

Fortunately, our trainers have years of experience managing AdWords accounts for a wide variety of business types and actively work on accounts in addition to providing training, so you can be sure the recommendations you receive are time-tested.

However, some questions come up during each training session, and rightly so, as PPC advertising isn’t cut and dried. As I look forward to my next training (2 weeks away in Los Angeles!) I thought it might be helpful to review some common questions. Read More…

Technical and Webmaster Guidelines for HTTPS

Spurred on by the Edward Snowden revelations, Google has begun taking security more seriously. After the revelations came out, Google quickly secured and patched their own weaknesses. Now they are pushing to encrypt all internet activity by giving websites that use SSL certificates a boost in rankings.

During a Google I/O presentation this year called HTTPS Everywhere, speakers Ilya Grigorik and Pierre Far made it clear that this move is not just about encrypting the data passed between server and browser, but also about protecting users from having the metadata surrounding those requests collected.

Though the metadata collected by visiting a single unencrypted website is benign, aggregating that data can pose a serious security risk for the user. Thus, by incentivizing HTTPS, Google has begun to eliminate instances on the web where users could be vulnerable to having information unknowingly collected about them.

I will give you the SparkNotes version of the HTTPS Everywhere presentation, but even that will warrant a TL;DR stamp. My hope is that this outline and the resource links contained within it give you a hub you can use when evaluating and implementing HTTPS on your site. Read More…

Use Excel to Analyze Keywords That Rank for Multiple Pages

Here’s a quick tutorial on how to use Excel to analyze the keywords that have more than one of your site’s pages ranking in Google organic search results.

Your site may have plenty of keywords that have more than one landing page ranking for a variety of reasons. For example, when someone googles “Google Analytics Training”, there are many different LunaMetrics pages that might display, based largely on where the user is located.

Let’s look at how we can break these out and analyze them further. Read More…

Access 404 Error Metrics Using Google Tag Manager

As analysts and marketers, we always want to track positive performance metrics and conversions in Google Analytics. However, tracking errors is also important to monitor the health of your site and keep track of signals indicating a negative user experience.

Accessing this data gives us a better idea of what’s causing users to get lost and wander into the dark, unattached voids of your domain. Knowing where these problem spots are makes it easier to fix internal links or set redirects.

I’ll show you different ways to view where people are hitting these error pages and where they are coming from, either through your existing setup or by using Google Tag Manager to fire events or virtual pageviews. Read More…

8 Essential Tips for Great Business Phone Call Etiquette

Raise your hand if you’ve heard a co-worker say “Ugh, I’ve gotta jump on a call”! Most people don’t look forward to phone calls with clients. There’s the inherent fear that you’re not prepared (It’s hard to imagine the audience in their underpants when you’re only calling one person across the country), or that you don’t have the right report or solution lined up.

If you work in the Search & Analytics fields like we do at our office, it’s quite possible that you have not and will not meet certain clients face-to-face due to distance, so building rapport can be a challenge. You just don’t get to shoot the breeze on the phone like you might during an on-site visit or lunch with your client.

In fact, relationship building is my favorite part of working with clients. Helping them succeed and meet their objectives helps me succeed and meet mine, so I invest in good client relationships wherever I can.

If you don’t share my excitement over client calls, I’ve assembled the following presentation to help you ease any fears when preparing for and executing your next client call.

Read More…

Understanding Bot and Spider Filtering from Google Analytics

On July 30th, 2014, Google Analytics announced a new feature to automatically exclude bots and spiders from your data. In the view level of the admin area, you now have the option to check a box labeled “Exclude traffic from known bots and spiders”.

Most of the posts I’ve read on the topic simply mirror the announcement without really talking about why you would want to check the box. Maybe a more interesting question is why you would NOT want to. Still, most people will ultimately want to check this box. I’ll tell you why, but also how to test it beforehand. Read More…

20 Google Facts & Stats that Every Marketer Should Know

No company dictates the online marketing industry and all of our careers like Google. Regardless of whether you use the company’s products, your customers do and that leaves you no choice but to become a Google expert.

This post outlines 20 things that every marketer should know about Google. Some are huge (and somewhat unimaginable) dollar figures. Others are market share percentages. The one thing they all have in common: you need to know them.

If we missed any important facts, please let us know in the comments. Read More…