Find Internal Links with Campaign Parameters Using Screaming Frog

Campaign tracking is a great way to keep track of how much traffic you are sending to your website from external sources (Facebook posts, referral sites, etc.). With campaign tracking, we append a few query parameters to URLs that point to our website. Those parameters tell Google Analytics where that link was posted: information only we know, such as which email campaign or which particular Facebook ad sent the visitor to your site.

There’s a huge, huge caveat with campaign parameters: never use them on links from your website to pages on that same website. If you have done this, or are not sure, follow the steps below to find these links on your site with a tool called Screaming Frog.

A Little Background

Why is it a bad idea to use campaign tracking on links within your website? We actually wrote a blog post already that covers that topic. When Google Analytics sees someone arrive with campaign parameters, it will end the current session and start a new session with the new information. This can lead to tricky issues when trying to determine which channel should get credit for a purchase or goal conversion.

Here’s an example: a user comes to your site via a paid search ad, so their session is labeled Source/Medium = “google / cpc.” Let’s say they then click a link on your website that goes to an internal page (also on your website and tracked with the same GA property), and that link has campaign tracking parameters on it, such as this:

www.mydomain.com/about-us.html?utm_source=mydomain.com&utm_medium=mainNav&utm_campaign=aboutUs

… their original session will end and a new session will start. This new session will have Source/Medium = “mydomain.com / mainNav” and all behavior from that point forward will be credited to that traffic source. Should that user complete a goal, the channel that receives credit inside Google Analytics will not be paid search – it will be the source and medium you see in the link’s campaign parameters.

Using campaign tracking parameters on internal links makes channel attribution really difficult, in the default reports and also within the Multi-Channel Funnel reports. In most cases, you’ll want to use event tracking to track clicks on internal links instead.
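For instance, instead of tagging a main-navigation link with campaign parameters, you could fire a GA event when it’s clicked. Here’s a minimal sketch assuming Universal Analytics (analytics.js) is already on the page; the selector and event values are hypothetical examples:

// Track clicks on an internal link with an event instead of utm_ parameters.
document.querySelector('#main-nav-about').addEventListener('click', function () {
  // Category, action, and label are up to you; these are example values.
  ga('send', 'event', 'Main Navigation', 'click', '/about-us.html');
});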

So, let’s find any potential issues and correct them!

What is Screaming Frog?

Screaming Frog is a great tool for investigating various aspects of your website by crawling through the files. It is a tool used often by those in the SEO industry, but we can take advantage of its strength for analytics purposes as well.

We have another great blog post that provides a little background on Screaming Frog as well.

Finding Internal Links with Campaign Tracking Parameters

To have Screaming Frog crawl your website, you enter your domain in the search bar and hit “Start.” Before we do that, though, we’re going to update a couple of settings.

First, go to Configuration > Custom from the main navigation.

[Screenshot: the Custom option under the Configuration menu]

This is where we will add a filter that will allow us to search for any pages on our website that have campaign tracking parameters. Using the Custom option will return a list of URLs, which we can then investigate to see where the links exist on those pages.

In the Custom window, add the following line to the first filter condition:

[yourdomain](.*)utm_source=(.*)utm_medium=

[Screenshot: the custom filter containing the utm_ regular expression]

This tells Screaming Frog that we want it to search for pages that contain any string of text matching that regular expression (yes, you can enter regular expressions here even though the condition is “contains”). Replace [yourdomain] with your own domain. Note that this pattern expects the medium parameter to follow the source parameter; you might want to search for just one of the parameters if you’re unsure how they are implemented across your website.
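If you’d like to sanity-check the pattern before crawling, here’s a quick test you can run in your browser’s JavaScript console (the domain here is just a stand-in for your own):

// The filter regex, with the placeholder replaced by an example domain.
var filter = /mydomain\.com(.*)utm_source=(.*)utm_medium=/;
// Logs true: a tagged internal URL.
console.log(filter.test('http://www.mydomain.com/about-us.html?utm_source=mydomain.com&utm_medium=mainNav'));
// Logs false: the same URL without campaign parameters.
console.log(filter.test('http://www.mydomain.com/about-us.html'));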

Tip – Go to Configuration > Spider and uncheck JavaScript and images in order to make the crawl run faster.

Now we can run our crawl. Any results of the crawl will start to appear in the table below, such as this:

[Screenshot: crawl results with an example result selected]

Click on any one of them to see more details. You can also copy the address of a page that was found and navigate to it in your browser to inspect further.

One example from our website is the blog post on “Campaign Tracking using _setAllowAnchor” which has a couple examples of how campaign parameters look on a URL:

[Screenshot: example result showing campaign parameters in a blog post]

This is alright, because we’re not actually linking anywhere – it is just a text example in one of our blog posts.

You may find that you have to check the source code to see where the parameters are used in links on your webpages. If there is an actual link on a page of your site that carries campaign parameters, those parameters likely won’t be visible to the user, so you’ll have to look for the links in the source code.

View the source of the page and then search (CTRL + F) for “utm_” to find any links, to your own site, that have the campaign parameters on them.
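Alternatively, a small snippet like this one, pasted into the browser’s JavaScript console on a suspect page, should list internal links that carry campaign parameters (a quick sketch, not a definitive audit):

// Collect anchors whose href contains "utm_" and points at the current host.
Array.prototype.slice.call(document.querySelectorAll('a[href*="utm_"]'))
  .filter(function (a) { return a.hostname === location.hostname; })
  .forEach(function (a) { console.log(a.href); });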

[Screenshot: a link with campaign parameters in the page source]

If they are in place on links to pages on your website, you will want to remove them and potentially add event tracking for those links.

Amanda Schroeder is an Analytics Engineer and comes from the marketing industry where she found a need for accurate, insightful data that could aid in making results-driven decisions. Amanda’s passion for building solid measurement strategies and connecting all the pieces of integrated digital and traditional marketing campaigns has led her to her current role at LunaMetrics.

  • Interesting post Amanda! Quick question: how do you deal with very large sites (10,000 or 100,000+ pages) when running this check?

    • Amanda

      Thanks Paul!

      My best recommendation is to prevent Screaming Frog from crawling files that you don’t need it to. But I’ll admit that I’m no master of Screaming Frog; I use it sparingly to find potential problems for analytics purposes. So I consulted with a colleague of mine here who uses it frequently for SEO purposes, Sean McQuaide, and he provided this extremely helpful response with lots of examples (I can’t possibly take credit for this!):

      This is a common challenge. Because Screaming Frog’s crawl data is saved to the computer’s RAM, the amount of RAM on your computer and the amount allocated to Screaming Frog will affect the number of URLs (HTML, image, JS, CSS, etc.) you will be able to crawl. Screaming Frog’s default allocation is 512 MB, which they say should allow you to crawl between 5k and 100k URIs. That range is wide because some sites have more URIs per page than others; think of a portfolio page (image heavy) vs. a Wikipedia page. Changing this number is recommended for larger sites. Screaming Frog has instructions on how to do that here – https://www.screamingfrog.co.uk/seo-spider/user-guide/general/#memory. You should also know that by default Screaming Frog will stop a crawl and prompt you to save your data if it begins to reach its memory limit.

      Preventing Screaming Frog from crawling images, CSS, JS, Shockwave files, and external links is step one to making the most of RAM space. These items are crawled by default. To prevent them from being crawled, go to Configuration > Spider, then uncheck each item. https://gyazo.com/18cea247e80256924331674b4fb40924

      If you are still running into memory issues, you can try the include/exclude feature. This feature lets you use regex rules to tell the crawler which URLs to crawl and which to skip. Include operates as include-only, just like in GA, and it’s great for crawling by directory. Similarly, you can use exclude to exclude a certain directory.

      This way of crawling is not great for sites with a poor URL structure, though. Screaming Frog is a web crawler: it starts with one URL, crawls that page, then looks for other URLs that match your include/exclude rules. If the site is set up poorly, the crawler might crawl the first URL it finds and then hit a dead end. I have had this happen many times, usually on eCommerce websites. If this happens, move to option 3. https://gyazo.com/9d4a0b0942d423ac116efad779680578

      Option 3: Use a sitemap. Screaming Frog can crawl URLs using a sitemap. You enter the URL of the sitemap – or sitemap index – and it will crawl all the URLs in that sitemap. If you use a sitemap index, it will crawl all URLs in all of its sitemaps. To access this feature, change the crawl mode to List (Mode > List). https://gyazo.com/2df38f1f42841d1cb83d2bcccd9d1da8 Then click on Upload and select Download Sitemap or Download Sitemap Index. https://gyazo.com/02d045e1bd9d2b261a00fa14388825dc You’ll then be presented with a dialog box where you can enter the URL of the sitemap.

      If you’re using this feature then you likely have a very large site and crawling individual sitemaps might take considerable time and attention. To save you some time, you can break up the sitemap index and have the crawler crawl sections of a sitemap index.

      Example:
      example.com/sitemap_index.xml contains 500k URLs across 100 sitemaps.

      You can save the sitemap index to your computer and use a text editor to break it up into quarters (or script the split; see the sketch below).

      example.com/sitemap_index-1.xml has 25 sitemaps totaling 125k URLs.
      example.com/sitemap_index-2.xml has 25 sitemaps totaling 125k URLs.
      example.com/sitemap_index-3.xml has 25 sitemaps totaling 125k URLs.
      example.com/sitemap_index-4.xml has 25 sitemaps totaling 125k URLs.

      You can then host the sitemap segments on any website and use Screaming Frog’s sitemap index feature to crawl these segments of the original sitemap. I say any site because sometimes you don’t have the ability to upload documents to the website you’re trying to crawl. All Screaming Frog needs is a live URL from which it can download the list of URLs in the sitemap. So if you don’t have access to the main site’s FTP, you can upload the segments to a personal site and use those live URLs to crawl the pieces of the original sitemap index.
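      If you’d rather not split the file by hand, a rough Node.js sketch of the splitting step might look like this (the file names and the four-way split are assumptions; adapt them to your own index):

      var fs = require('fs');

      // Read the locally saved sitemap index and pull out each <sitemap> entry.
      var xml = fs.readFileSync('sitemap_index.xml', 'utf8');
      var sitemaps = xml.match(/<sitemap>[\s\S]*?<\/sitemap>/g) || [];
      var perFile = Math.ceil(sitemaps.length / 4);

      // Write four smaller indexes: sitemap_index-1.xml through sitemap_index-4.xml.
      for (var i = 0; i < 4; i++) {
        var chunk = sitemaps.slice(i * perFile, (i + 1) * perFile);
        if (chunk.length === 0) { break; }
        fs.writeFileSync(
          'sitemap_index-' + (i + 1) + '.xml',
          '<?xml version="1.0" encoding="UTF-8"?>\n' +
          '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n' +
          chunk.join('\n') + '\n' +
          '</sitemapindex>\n'
        );
      }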

      I hope this is helpful to you – best of luck!
      Amanda

      • Hi Amanda and Sean,

        Many thanks for your in-depth reply here! This makes a lot of sense. I am not an expert in using this tool either so this will definitely come in handy.

        Thanks again,
        Paul

        • Amanda

          No problem!

      • AlexanderHoll

        Hi Amanda and Paul,

        Screaming Frog does run out of memory quickly, especially on larger sites. I think, however, there are one or two tweaks around this problem beyond what Amanda already described:
        1.) In Screaming Frog, limit your crawl to your desired number of URLs or folder depth. You configure this in the spider settings (which isn’t really a tweak, but it helps you prioritize your efforts).
        2.) If you really need a massive crawl, you can combine Screaming Frog with Amazon Web Services. Michael describes this in detail over here: http://ipullrank.com/how-to-run-screaming-frog-and-url-profiler-on-amazon-web-services/. If you speak German, there is a very good guide here as well: http://www.seo-trainee.de/screaming-frog-seo-spider-in-amazon-cloud/
        Best regards from Munich
        Alexander

        • Amanda

          I didn’t know you could combine Screaming Frog with Amazon Web Services – great tip, thank you!

  • Hi Amanda: Good article. I love Screaming Frog! What happens if you add some Campaign tracking parameters to an internal link that do not include utm_source and utm_medium?

    • Amanda

      Hi Les,

      If you are not using ‘utm_’ parameters, you shouldn’t have to worry about Google Analytics tracking issues. Google Analytics specifically looks for those parameters.

      If you’re curious where those links exist, you can adjust the regex filter to look for whatever custom parameters you’re using instead of the ‘utm_’ parameters that Google recognizes.
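      For example, a filter for a hypothetical ‘src’ parameter could look like this:

      [yourdomain](.*)src=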

      Hope that helps!
