Marketing Channel Attribution with Markov Models in R


Data empowers us to better understand our users and their behaviors, while methods provide us with the means for analysis. These methods, ranging from the simple (e.g. frequency counts) to the complex (e.g. clustering), allow us to choose what we want to understand from the data.

A popular way to understand our users and their behaviors in Google Analytics is through multichannel attribution in the Multi-Channel Funnels Reports using simple heuristics: First Click, Last Click, and Linear Attribution. Although these methods respectively provide insight into the frequency of the first marketing touchpoint, the frequency of the last marketing touchpoint, or a sequence of equally important marketing touchpoints, a data consumer may want a different snapshot. For instance, someone who wants to understand the level of importance and/or the value of each touchpoint in relation to conversions must use a different method: Markov modeling.

Just a few, quick reasons to use this method…

  1. If you don’t have Google Analytics Premium (GA 360), then you don’t have access to the Data-Driven Attribution Model.
  2. Even if you do have Premium, the Data-Driven Attribution algorithm is a bit of a black box.
  3. You can use any sequenced data – you are not limited to only using marketing channels.

Let’s dive in!

The Make of Markov

Markov model example from “Mapping the Customer Journey: A Graph-Based Framework for Online Attribution Modeling”, October 2014, Anderl, et al.

A Markov model determines the probability that a user will transition from Sequence A to Sequence B based on the steps that each user takes through a site. The contents of these sequences are determined by the Markov order, which ranges from 0 to 4. Use the following as an example:

  • Order 0: Doesn’t know where the user came from or what step the user is on, only the probability of going to any page.
  • Order 1: Looks back zero steps. You are currently at Step A (Sequence A). The probability of going anywhere is based on being at that step.
  • Order 2: Looks back one step. You came from Step A (Sequence A) and are currently at Step B (Sequence B). The probability of going anywhere is based on where you were and where you are.
  • Order 3: Looks back two steps. You came from Step A > B (Sequence A) and are currently at Step C (Sequence B). The probability of going anywhere is based on where you were and where you are.
  • Order 4: Looks back three steps. You came from Step A > B > C (Sequence A) and are currently at Step D (Sequence B). The probability of going anywhere is based on where you were and where you are.

Markov models account for user paths that extend past the order number by acting like a sliding window. Let’s say that User X’s steps were as follows: A > B > C > D > E > F > G. This model would show User X going from Sequence A (A > B > C > D) to Sequence B (B > C > D > E) to Sequence C (C > D > E > F), and so on until User X either exited or converted.
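
To make the sliding window concrete, here is a minimal base R sketch. The helper name sliding_states is made up for this illustration and is not part of any package; it simply cuts a path into overlapping windows of "order" steps.

```r
# Hypothetical helper (not from any package): split a path into
# overlapping windows ("sequences") of `order` steps each.
sliding_states <- function(path, order, sep = " > ") {
  steps <- trimws(strsplit(path, ">", fixed = TRUE)[[1]])
  # A path no longer than the order yields a single state.
  if (length(steps) <= order) {
    return(paste(steps, collapse = sep))
  }
  starts <- seq_len(length(steps) - order + 1)
  vapply(
    starts,
    function(i) paste(steps[i:(i + order - 1)], collapse = sep),
    character(1)
  )
}

# User X's path from the text, viewed through an order-4 window.
states <- sliding_states("A > B > C > D > E > F > G", order = 4)
print(states)
```

Each element of states is one "sequence" in the sense used above, and the model estimates the probability of moving from one element to the next.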

Choosing the best Markov order can be difficult. Without getting into a lot of detail, one way is to plot the training accuracy of the model versus the training standard deviation. The goal is to find where these two lines intersect, or where the model gains variability and loses accuracy equally.

Although it may appear daunting, having a basic understanding of the math behind the model can be helpful as well.

Markov model example from “Mapping the Customer Journey: A Graph-Based Framework for Online Attribution Modeling”, October 2014, Anderl, et al.

Luckily, this can be simplified into 3 main parts:

  1. The Transition Probability (wij) = The Probability of Moving to the Current State (Sequence B, Xt) Given the Previous State (Sequence A, Xt-1)
  2. The Transition Probability (wij) is No Less Than 0 and No Greater Than 1
  3. The Sum of the Transition Probabilities Out of Any One State Equals 1 (Everyone Must Go Somewhere)
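
These three properties are easy to check on a small example. The sketch below estimates first-order transition probabilities in base R from three invented paths; the data, the column names, and the special state labels "(start)", "(conversion)", and "(null)" are all assumptions made for illustration.

```r
# Toy data, invented for illustration: each row is a path, with how many
# users converted and how many left without converting.
paths <- data.frame(
  path = c("A > B", "A", "B > A"),
  conversions = c(3, 1, 2),
  nulls = c(1, 3, 2),
  stringsAsFactors = FALSE
)

# Turn every path into weighted "from -> to" moves.
edges <- do.call(rbind, lapply(seq_len(nrow(paths)), function(r) {
  steps <- trimws(strsplit(paths$path[r], ">", fixed = TRUE)[[1]])
  chain <- c("(start)", steps)
  total <- paths$conversions[r] + paths$nulls[r]
  rbind(
    # Moves between consecutive steps are taken by every user on the path.
    data.frame(from = head(chain, -1), to = tail(chain, -1), n = total),
    # The last step leads either to a conversion or to an exit.
    data.frame(from = tail(chain, 1), to = c("(conversion)", "(null)"),
               n = c(paths$conversions[r], paths$nulls[r]))
  )
}))

counts <- xtabs(n ~ from + to, data = edges)
w <- prop.table(counts, margin = 1)  # w[i, j] = P(next = j | current = i)
print(round(w, 3))
print(rowSums(w))                    # every row sums to 1
```

Reading across a row of w gives the chances of each next step from that state, which is why each row has to add up to 1.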

The Make of ChannelAttribution

ChannelAttribution, an R library, builds the Markov models that allow us to calculate the number of conversions and/or conversion value that can be attributed to each marketing channel. In other words, ChannelAttribution uses Markov models to determine each channel’s contribution to conversion and/or value.

This model focuses on solving the following issues:

  1. Objectivity – No gut feelings here! Only facts.
  2. Predictive Accuracy – Predicts conversion events.
  3. Robustness – Valid and reliable results.
  4. Interpretability – Transparent and relatively easy to interpret.
  5. Versatility – Not dataset dependent. Able to adapt to new data.
  6. Algorithmic Efficiency – Provides timely results.

It’s also important to keep in mind the following limitations of attribution:

  1. Endogenic – Attribution is relative to underlying conditions.
  2. Not Strict Causal Interpretation – Markov models do not explain 100% of the variance between marketing channel contributions. For instance, certain marketing channels may be inherently more effective in a given setting.

This library estimates the channel attribution by calculating the Removal Effect (si). Essentially, the Removal Effect is the change in the probability of converting when a step is completely removed; all sequences that had to go through that step are now sent directly to the exit node. The bigger the drop in conversion probability, the more that step contributes. This calculation is done by running a large number of simulations on the Markov model with the removed step. By default, it runs 1 million simulations. This occurs for each step present in the data.
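
For a chain this small, the removal idea can be worked out by hand. The base R sketch below uses an invented two-channel transition matrix and measures each channel's effect as the relative drop in conversion probability once the channel is removed. Note the swap: it solves the chain exactly with linear algebra rather than by simulation, so it illustrates the concept and is not the library's implementation. All names and numbers are assumptions.

```r
# Toy first-order transition matrix, invented for illustration.
states <- c("(start)", "A", "B", "(conversion)", "(null)")
w <- matrix(0, 5, 5, dimnames = list(states, states))
w["(start)", c("A", "B")] <- c(2/3, 1/3)
w["A", c("B", "(conversion)", "(null)")] <- c(1/3, 1/4, 5/12)
w["B", c("A", "(conversion)", "(null)")] <- c(1/2, 3/8, 1/8)

# Probability of eventually converting from "(start)", solved exactly
# from the absorbing-chain equations p = Q p + r, i.e. (I - Q) p = r.
conversion_prob <- function(w) {
  transient <- c("(start)", "A", "B")
  Q <- w[transient, transient]
  r <- w[transient, "(conversion)"]
  p <- solve(diag(length(transient)) - Q, r)
  unname(p[1])  # entry for "(start)"
}

# Removing a channel: every transition into it is redirected to "(null)".
remove_channel <- function(w, channel) {
  w[, "(null)"] <- w[, "(null)"] + w[, channel]
  w[, channel] <- 0
  w
}

base <- conversion_prob(w)
channels <- c("A", "B")
removal_effect <- sapply(channels, function(ch) {
  1 - conversion_prob(remove_channel(w, ch)) / base
})

# Rescale the effects so the channels' shares of conversions sum to 1.
share <- removal_effect / sum(removal_effect)
print(round(removal_effect, 3))
print(round(share, 3))
```

Multiplying share by the total number of conversions gives the conversions attributed to each channel in this toy setup.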

ChannelAttributionApp – A GUI

If you aren’t too familiar with R, but you’d still like to take advantage of what ChannelAttribution has to offer, there’s still hope! Or, if you would rather see the code, skip ahead to the “ChannelAttribution – R Code” section.

Use it on the shinyapp server by following this link:

The link should bring you to the following:
Shiny App

As shown in the image, click the “Load Demo Data” button (when you’re ready, you can load your own data by clicking the “Choose File” option under “Load Input File”).

Loaded Demo Data

If you’re using the demo data, the options are preselected for your convenience. Otherwise, you will need to choose the delimiter that separates the values in your data. Then you fill in the column names for your variables:

  • Path Variable – The steps users take across sessions, which make up the sequences.
  • Conversion Variable – How many times users who followed that path converted.
  • Value Variable – The total monetary value of the conversions from that path.
  • Null Variable – How many times users who followed that path exited without converting.

Then you hit “Run”! After it’s done executing (check the top right corner for progress), click on the “Output” tab. Analyze your results. In a lot of cases, you can see where this model is helpful: compared with the Markov model, the simple heuristics give some channels too little attribution and others too much. The analysis section below covers the bar charts and table from the output.

ChannelAttribution – R Code

If you’re more familiar with R, you might like this option better as you can customize your model and graphs. To get started, follow these steps:

Install and load ChannelAttribution, reshape, and ggplot2. Then load the demo data (or your own):

Next, remind yourself of the variables used in calculating the models:

  • Path Variable – The steps users take across sessions, which make up the sequences.
  • Conversion Variable – How many times users who followed that path converted.
  • Value Variable – The total monetary value of the conversions from that path.
  • Null Variable – How many times users who followed that path exited without converting.

Build the simple heuristic models (First Click / first_touch, Last Click / last_touch, and Linear Attribution / linear_touch):
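
To see what these three heuristics compute, here is a minimal base R sketch on invented toy data. It is not the package's implementation (ChannelAttribution provides heuristic_models() for this); the paths, the numbers, and the output column names are assumptions chosen for illustration.

```r
# Toy paths, invented for illustration.
paths <- data.frame(
  path = c("alpha > beta > alpha", "beta", "gamma > alpha"),
  total_conversions = c(6, 2, 4),
  stringsAsFactors = FALSE
)

# Give each touchpoint its credit under the three rules.
credit <- do.call(rbind, lapply(seq_len(nrow(paths)), function(r) {
  steps <- trimws(strsplit(paths$path[r], ">", fixed = TRUE)[[1]])
  conv <- paths$total_conversions[r]
  n <- length(steps)
  data.frame(
    channel_name = steps,
    first_touch  = conv * (seq_len(n) == 1),  # all credit to the first step
    last_touch   = conv * (seq_len(n) == n),  # all credit to the last step
    linear_touch = conv / n                   # equal credit to every step
  )
}))

# Add up the credit per channel.
H <- aggregate(cbind(first_touch, last_touch, linear_touch) ~ channel_name,
               data = credit, FUN = sum)
print(H)
```

Every column of H sums to the same 12 conversions; the heuristics only differ in how they split that total across channels.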

Build the Markov model (markov_model):
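
A minimal sketch of the call, assuming the package's bundled PathData demo dataset, which loads a data frame named Data with the columns path, total_conversions, total_conversion_value, and total_null. The names of the columns in the result can differ between package versions, so check names(M) before relying on them.

```r
library(ChannelAttribution)

# Demo dataset shipped with the package; loads a data frame named `Data`.
data(PathData)

# First-order Markov model. The var_* arguments name the columns that
# hold the path, conversion, value, and null variables described above.
M <- markov_model(
  Data,
  var_path  = "path",
  var_conv  = "total_conversions",
  var_value = "total_conversion_value",
  var_null  = "total_null",
  order     = 1
)

print(names(M))
print(M)
```

Raising order to 2 or higher uses the longer sequences described in the Markov order list earlier in the post.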

Perform some quick data munging for total conversions:
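
A minimal sketch of this munging step, using small hand-made stand-ins for the heuristic and Markov outputs. All numbers are invented and the column names are assumptions, so adjust them to whatever your own results contain. To keep it runnable without extra packages, it builds the long format by hand in base R instead of calling melt() from reshape.

```r
# Hand-made stand-ins for the two model outputs (numbers invented).
H <- data.frame(
  channel_name = c("alpha", "beta", "gamma"),
  first_touch_conversions  = c(6, 2, 4),
  last_touch_conversions   = c(10, 2, 0),
  linear_touch_conversions = c(6, 4, 2),
  stringsAsFactors = FALSE
)
M <- data.frame(
  channel_name = c("alpha", "beta", "gamma"),
  total_conversions = c(7, 3, 2),
  stringsAsFactors = FALSE
)

# One row per channel, one column per method.
R <- merge(H, M, by = "channel_name")
names(R) <- c("channel_name", "first_touch", "last_touch",
              "linear_touch", "markov_model")

# Long format for plotting: one row per channel/method pair.
methods <- setdiff(names(R), "channel_name")
R_long <- data.frame(
  channel_name = rep(R$channel_name, times = length(methods)),
  variable     = rep(methods, each = nrow(R)),
  value        = unlist(R[methods], use.names = FALSE),
  stringsAsFactors = FALSE
)
print(R_long)
```

The same steps apply to conversion value: keep the value columns instead of the conversion columns before reshaping.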

And now the fun part… Plotting the total conversions:
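
A minimal ggplot2 sketch of a grouped bar chart. The long-format data frame below is invented so that the block runs on its own; in practice you would pass in your own reshaped results, and the column names channel_name, variable, and value are assumptions.

```r
library(ggplot2)

# Invented long-format data: one row per channel/method pair.
R_long <- data.frame(
  channel_name = rep(c("alpha", "beta", "gamma"), times = 2),
  variable     = rep(c("last_touch", "markov_model"), each = 3),
  value        = c(10, 2, 0, 7, 3, 2)
)

# One group of bars per channel, one bar per method.
p <- ggplot(R_long, aes(x = channel_name, y = value, fill = variable)) +
  geom_col(position = "dodge") +
  labs(title = "Total Conversions", x = "Channel",
       y = "Conversions", fill = "Method")
print(p)
```

Swapping the data and the title is all it takes to reuse the same code for a conversion value chart.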

Thankfully, the process of creating the Total Value bar chart is very similar to creating the Total Conversions bar chart:

The Long-Awaited Bar Charts, Table, and Brief Analysis

Total Conversions

The “Total Conversions” bar chart shows you how many conversions were attributed to each channel (e.g. alpha, beta, etc.) for each method (e.g. first_touch, last_touch, etc.). Analyzing the graph, specifically the purple bar (markov_model) in comparison to the other methods, you can gain insights such as the following:

  • “alpha” was not actually as important in assisting conversions as the simple heuristics found.
  • “epsilon”, “lambda”, “theta”, and “zeta” were more important in assisting conversions than the simple heuristics found.

Total Conversion Value

The “Total Conversion Value” bar chart shows you the monetary value that can be attributed to each channel from a conversion.

For instance, you can see the following:

  • “alpha” was not actually as valuable in assisting conversions as the simple heuristics found.
  • “epsilon”, “lambda”, “theta”, and “zeta” were more valuable in assisting conversions than the simple heuristics found.

Table Form – Available via ChannelAttributionApp (GUI)

Furthermore, the GUI puts all this data into a table that you can download and open in Excel if you want to create your own charts.

Open in Excel

Although you can download the table and open it in Excel, it comes in semi-colon separated values. To convert this into usable data, you can follow these steps:

Select Column A. Then on the Data tab, click “Text to Columns”. It will bring up the following prompt:

Select “Delimited” and click “Next”.

Choose the “Delimiter”. In this case, it is the semicolon. Click “Finish”. From there, you can make your bar charts again or create custom charts to your liking.
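
If you would rather stay in R, a semicolon-separated file can be read directly. In the sketch below, a temporary file with invented contents stands in for the downloaded table; the real file name and columns will differ.

```r
# Stand-in for the downloaded file (contents invented for illustration).
tmp <- tempfile(fileext = ".csv")
writeLines(
  c("channel_name;first_touch;markov_model",
    "alpha;6;7.2",
    "beta;2;3.1"),
  tmp
)

# sep = ";" tells read.csv to split columns on semicolons.
tbl <- read.csv(tmp, sep = ";", stringsAsFactors = FALSE)
str(tbl)
```

If your file uses a comma as the decimal mark, read.csv2() assumes both the semicolon separator and the decimal comma.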

Opportunities Abound!

Now that you can get a better understanding of your Google Analytics marketing channel data, there is room to explore additional features of ChannelAttribution, reshape, and ggplot2. Bear in mind that although this library is mainly used for channel attribution issues, you can use it for almost any sequenced data. So get creative, and maximize your data’s potential!

Kaelin Harmon is a Junior Analyst at LunaMetrics. She is pursuing her bachelor's and master's degrees in Data Analytics at Robert Morris University. She's all about bridging the gap between businesses and customers by turning data into actionable insights. If you see someone in a Storm Trooper onesie enjoying delicious grub at a restaurant or exploring downtown Pittsburgh, it's probably her. She's loving her new freedom from the First Order.

  • Doug Male

    Fantastic work but….keeps disconnecting me from App when i try to run my own data?!

    • Stephen M. Harris

      Same here. Seems to be an issue with the Shiny app.

  • Cecio82

    Original paper suggests that higher order Markov Model (k=3 or 4) should provide better predictive accuracy, though there is a tradeoff between accuracy and robustness.

  • Jesse

    How did you pull the null variable from GA?

    • Michal Prochazka

      You don’t need them. Just download sessions and transactions from GA. Order them based on user and time, create sequences, and mark every user-and-session sequence that did not finish with a transaction (conversion).

      • Udi Sabach

        Michal, can you please elaborate?

        • Michal Prochazka

          You can use to extract data or some R library and you will see that not all conversion paths are finished by transaction. I can send you some example.

          • Udi Sabach

            i see, maybe i am missing something… but by definition, a conversion path would need to end in conversion. otherwise, it’s not a conversion path.

          • Michal Prochazka

            Conversion path “alfa>beta>gama” can be finished by conversion for 3 users and this same path can be NOT finished by conversion for another 5 users.
            Sorry for using “conversion path” lets switch to “channel sequence”.

          • Al3x4nd3rR

            Hi Michal,

            I still do not get how you are able to pull out MFC-data from GA, and look at non-conversion paths/channel sequence. Can you please share what dimensions and metrics you pull out to get these data?

          • Michal Prochazka

            Hi Al3x4nd3rR,
            just take browserId (custom dimension), date, hour, minute, channel, source / medium, campaign and sessions + transactions.
            Sort data by browserId, date, hour, minute and you have all conversion and NON conversion journeys.
            Of course you can use some filter on those visitors who start shopping like “add to basket”.

            Kind Regards MP

          • Josef

            Ignoring the fact how GA is messing visits from direct channels, such export will be too vague for any serious attribution analysis. No objection, you can get the data, just any calculation on top of such data is not of much use. Btw. you can store the time into the custom dimension as well and use three dimensions for something more useful.

          • Michal Prochazka

            Hi Josef,
            I was elaborating on how to get the data; personally, I think everyone should be curious about data quality and collection methods first, before starting any experiments.

          • Jake Weinberger

            Can you please elaborate on this? I don’t understand how you can have a conversion path that doesn’t end in a conversion.

          • Michal Prochazka

            Sorry for the misunderstanding, I will try to make it clear. A conversion path in this case is a sequence of channels which brings at least one user to your site. These sequences grow with each new user visit until the user makes a conversion. Each sequence has three fact values: the number of users who came by this sequence and made an order, the sum of order revenue, and the number of users without an order who followed the same path. A user who made a conversion last week will come to your site next week. He will start a new path or “customer journey” with that “next week” visit, and the “null variable” will be increased for another sequence.

  • Samit Gorai

    I did not get it properly. When you are talking about channel attribution you mean mediums like SEO, PPC, Campaign, Email, Direct etc. Now how come a visitor have so many channel touch points and what about recency factor? This model can be very much applicable for content scoring or assigning $value to various pages but I’m not very confident about channel attribution.

    • Michal Prochazka

      Hey, my understanding is that alpha, beta, … are channels as you mentioned. Each sequence contains all user sessions (interactions) which were measured. In terms of recency, you can check if the “ChannelAttribution” or “MC” implementation calculates recency based on channel order in the sequence, but the input data doesn’t carry any information about the time between the last interaction and conversion.

  • Udi Sabach

    Hello. is it possible to generate the table in R?

  • Jon

    Well found Kaelin. Have you managed to find a package that replicates the Game theory solution employed in GA’s DDA model? In many ways the math is simpler in their approach.

  • Анатолий Андрющенко

    Can you explain where in the Analytics reports I can download the Null Variable?

  • Abhishek Sinha

    really helpful. Could you please explain how we come up with total_conversion_value column data?

  • Brian Maher

    Excellent piece, as always. What are the chances of demo extract excel or example of export data from GA? Would clarify a few points for some people here I think

    • Dina Dicic

      Yes that would really help! I struggle with the export from GA.

  • Dina Dicic

    Can you please help with how to extract data in this form from Google Analytics?

    • Martin Frotzler

      You can either export it from within GA and import the CSV file into R or go the direct way (which is more elegant). The direct way goes via RGA library and access token, you can look it up. You then access the Multi Channel Funnel Data in R via get_mcf.

      Using the channels can be a bit of a hassle as you come across a few formatting problems. The ChannelAttribution library doesn’t like empty spaces and such. But once you have dealt with that in your data, it’s quite convenient to use. The good thing is that the general formatting A>B>C comes from GA, so this doesn’t need to be addressed separately.

      Have fun!

      • Dina Dicic

        Thank you!

  • Abhishek Sinha

    Please help. Could you please explain how we come up with total_conversion_value column data?

  • Julio

    Wow, thanks a lot for this. I have been searching for days and this is the first article with hard information.

  • Pierre

    Hello, thanks for sharing this awesome post and code !

    I have a small question regarding attribution with markov model of order > 1. What happens then ? The markov chain becomes bigger so that you also have nodes like (alpha, eta) or (beta, alpha). How can you then calculate the contribution of each channel ? It’s obvious for each node of the graph but not for each channel (to answer the question, what should be my spending repartition).

    I guess one obvious way would be to delete all nodes containing the channel. But I don’t see no theoretical guarantee. Something else could be to use some game theory ? Something like this maybe : ?

    Again, great work and thanks for sharing.

  • John Smith

    Great post thanks. What is not clear, once computing the model how to use it to compute attributions of individual journeys, e.g. suppose you had “alpha-alpha-beta-delta-delta-(conversion)”, what would be the attributions for alpha, beta and delta for this journey using the model?

  • Ray

    Great article!

    One question. Why are there large variances in attribution within the Markov model? In the simplified version, calculating the removal effect seems to be precise with no variance. How does this change when the model becomes more complex with more attribution channels? What is producing the varying results?

    For instance, running the test data twice within shiny will produce different results for each attribution channel within the markov model, with variances ranging from -5% to 15% depending on the channel.

  • Pablo

    I have managed to get the charts. However, the channel names are showing partially e.g.: Dir, Ref and Org showing as ir,ef and rg
    Here is the link which I am referring to:
    I would appreciate any help on this

