
How To Remove Referral Spam from Google Analytics

By Jannelle Chemko

What is Referral Spam?

Referral spam is a technique in which spammers send repeated requests to a website with a fake referral URL, so that the site the spammer wants to advertise shows up in that website’s analytics reports.

I’ve noticed over the last few months that the amount of referral spam showing up in Google Analytics has increased to the point where it really skews the data. There has recently been a bit of buzz around the web on this issue, and more and more posts have been popping up with advice on various techniques for filtering this traffic, or even stopping it from hitting the server to begin with.

Referral_Spam_Examples_in_Google_Analytics

This can be very frustrating and waste a lot of time, as it’s gotten to be up to 5-10% of overall traffic in some cases, even for sites that have substantial monthly visitor counts. As a result, the data has to be manipulated more to get a good sense of where traffic is actually coming from. It can also be annoying to have to continually explain to a client what semalt and 100dollars-seo are, and why they keep showing up in their analytics when they aren’t actually helping them.

Since Google hasn’t taken any steps at this point to help users deal with this issue, we needed to take this into our own hands, and tried to find a solution that was easy to implement and solved the problem without causing further difficulties. So here we went, on a little journey…

Failed Solution 1: Updating the .htaccess file

Back in February of this year, the first method we tried was to edit the .htaccess file to prevent the traffic from even hitting the server in the first place. This involved a few steps that our development team could easily take to add in all of the offending URLs and banish them from showing up forever. Once this was implemented, however, we noticed a huge slowdown on a couple of sites: the more pages a site had, the worse its speed got. With every request, the referrer was being checked multiple times, and the overhead added up. In our case, average time to first byte for every file was about 500-600ms after the change, compared with 100-150ms before the file was updated.

As a result, we decided to revert right away and put up with the skewed data until we found another workable solution.

Failed Solution 2: Adding Domains to the Referral Exclusion List

Another option that we looked at was adding the unwanted domains to the Referral Exclusion List under Admin –> Tracking Info.

Well, it turns out that this didn’t work either, as it doesn’t actually remove the Users and Sessions; it just moves them into your Direct Traffic stats instead. After running this option for a few more days, I stumbled onto an article by Black Belt Robots that showed that this was indeed the case, and helped me towards my current solution.

After these two failed trials and tribulations, we’ve managed to find a solution that isn’t complex and doesn’t take too much time.

Our Steps to Success!

Step 1: Set Up Bot Filters

One thing that you should definitely be doing as a base starting point is to go to the Admin tab of your Analytics account and, under the specific View, select Bot Filtering to exclude all hits from known bots and spiders (although I’m curious to know how much of an impact this has at the moment).

Google_Analytics_Bot_Filtering

Do this at the very minimum, and then continue on to the next steps to remove more of the culprits, which this setting doesn’t seem to capture.

Step 2: Filter by Campaign Source

After not getting the data we wanted from adding domains to the Referral Exclusion List, we tried setting up a new Filter instead. As far as we can tell, this stopped the unwanted traffic from showing up completely (again, props to Black Belt Robots).

To set up this filter, start by going to your Google Analytics View, and create a new Custom Filter that excludes by Campaign Source (we named it Exclude Spam Referrals).

Google_Analytics_Exclude_Referral_Spam_Filter

Fill out the Filter Pattern field with the pesky domains that you need to exclude, and then feel free to click Verify this Filter to ensure that it will indeed remove that traffic.

Some of the worst offenders that we have found include the following (already formatted for the Filter Pattern field, with each dot escaped and the domains separated by pipes):

event-tracking\.com|best-seo-offer\.com|free-share-buttons\.com|buy-cheap-online\.info|get-free-traffic-now\.com|free-social-buttons\.com|buttons-for-your-website\.com|weburlopener\.com|webmaster-traffic\.com|100dollars-seo\.com

it-max\.com\.ua|billiard-classic\.com\.ua|ci\.ua|mirobuvi\.com\.ua|simple-share-buttons\.com|social-buttons\.com

As the Filter Pattern field has a limit of 255 characters, we had to set up two separate filters to account for all of the offending domains.
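The splitting described above can be sketched in Python. This is only an illustration of the idea, not part of Google Analytics: the function name `build_filter_patterns` is hypothetical, and the script simply escapes each dot, joins the domains with pipes, and starts a new pattern whenever the next domain would push the current one past 255 characters.

```python
MAX_PATTERN_LENGTH = 255  # Filter Pattern field limit mentioned in the article


def build_filter_patterns(domains, max_length=MAX_PATTERN_LENGTH):
    """Group escaped domains into "a|b|c" patterns no longer than max_length.

    Hypothetical helper for illustration; it is not a Google Analytics API.
    """
    patterns = []
    current = ""
    for domain in domains:
        # Escape dots so they match a literal "." rather than any character.
        escaped = domain.replace(".", r"\.")
        if len(escaped) > max_length:
            raise ValueError(f"{domain} does not fit in a single pattern")
        candidate = escaped if not current else current + "|" + escaped
        if len(candidate) > max_length:
            # The current pattern is full, so start a new one.
            patterns.append(current)
            current = escaped
        else:
            current = candidate
    if current:
        patterns.append(current)
    return patterns


# The domains listed in the article.
spam_domains = [
    "event-tracking.com", "best-seo-offer.com", "free-share-buttons.com",
    "buy-cheap-online.info", "get-free-traffic-now.com",
    "free-social-buttons.com", "buttons-for-your-website.com",
    "weburlopener.com", "webmaster-traffic.com", "100dollars-seo.com",
    "it-max.com.ua", "billiard-classic.com.ua", "ci.ua", "mirobuvi.com.ua",
    "simple-share-buttons.com", "social-buttons.com",
]

patterns = build_filter_patterns(spam_domains)
for number, pattern in enumerate(patterns, start=1):
    print(f"Filter {number} ({len(pattern)} characters): {pattern}")
```

Each printed pattern can be pasted into its own filter. Joining with a single pipe also avoids an empty alternative such as `a||b`, which as a regular expression matches every source.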

To make sure you gather a complete list of the domains that may be causing you issues, just go to your Acquisition Referrals report and look for strange-sounding domains; they’re pretty easy to spot.

We now continue to check about once per month to see if there are any more spammy domains that we need to add to the existing filters, and recommend you do the same.
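For the monthly check, a small script can show which referral sources are not yet covered by your existing filters, so you only have to eyeball the leftovers. This is a rough sketch under a few assumptions: the list of sources has been copied by hand from the Referrals report, the function name `uncovered_sources` and the sample data are hypothetical, and the patterns are treated as partial-match regular expressions.

```python
import re

# Shortened versions of the two filter patterns, for illustration only.
existing_patterns = [
    r"event-tracking\.com|best-seo-offer\.com",
    r"simple-share-buttons\.com|social-buttons\.com",
]


def uncovered_sources(sources, patterns):
    """Return the sources that none of the existing patterns match."""
    compiled = [re.compile(pattern) for pattern in patterns]
    return [
        source
        for source in sources
        # search() matches anywhere in the string (assumed partial matching).
        if not any(regex.search(source) for regex in compiled)
    ]


# Hypothetical sample copied from a Referrals report.
report_sources = [
    "google.com",
    "event-tracking.com",
    "social-buttons.com",
    "new-spammy-site.example",
]

for source in uncovered_sources(report_sources, existing_patterns):
    print(source)
```

Anything the script prints still needs a human decision: a legitimate referrer stays, and a spammy one gets added to a filter.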

Step 3: Add an Annotation

One thing to note is that filters are not retroactive: your historical data will not change, only the data collected once you’ve set up a filter. As a result, it’s a good idea to add a new Annotation every time you do something significant that may impact your analytics.

Google_Analytics_Create_Annotation

All we did here was set up a new annotation for the date that we set up these new filters (this also helped us each time we made a failed attempt earlier, so we could track what may have been causing issues and what was actually leading towards progress).

Alternate Solution:

Although we haven’t tried this one ourselves yet, I’ve come across another solution, from Distilled, that appears to be even simpler to set up. This solution was published just a couple of days ago.

It’s based on the premise that most spam sessions fall into one or both of the following categories:

  • Invalid hostname (i.e. not your site)
  • Screen resolution = “(not set)”

and thus two filters, one for each category, can be set up to ensure that this traffic is excluded.

Hostname Inclusion:

hostname_inclusion_filter

and Screen Resolution Exclusion:

screen_resolution_exclusion_filter
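The logic behind these two filters can be sketched as a simple check on each session. This is only an illustration of the premise, not how Google Analytics applies filters internally: the hostnames, the dictionary keys, and the function name `is_probable_spam` are all assumptions made up for the example.

```python
# Hypothetical hostnames that legitimately serve your site.
VALID_HOSTNAMES = {"example.com", "www.example.com"}


def is_probable_spam(session, valid_hostnames=VALID_HOSTNAMES):
    """Flag a session with an invalid hostname or an unset screen resolution."""
    invalid_hostname = session.get("hostname") not in valid_hostnames
    resolution_not_set = session.get("screen_resolution") == "(not set)"
    # A session is flagged if it falls into either category (or both).
    return invalid_hostname or resolution_not_set


# Hypothetical sample sessions.
sessions = [
    {"hostname": "www.example.com", "screen_resolution": "1920x1080"},
    {"hostname": "some-other-site.example", "screen_resolution": "1366x768"},
    {"hostname": "www.example.com", "screen_resolution": "(not set)"},
]

for session in sessions:
    label = "spam" if is_probable_spam(session) else "ok"
    print(label, session)
```

The hostname filter is an inclusion filter (keep only your own hostnames), while the screen resolution filter is an exclusion filter, which is why the two conditions are combined with `or` here.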

Since we’ve already created our other filters, we won’t be fiddling around with this for now, but will definitely be keeping this in mind for greater ease in newer Analytics accounts, should the solution have staying power.

Conclusion:

The good news is that, even though this issue can be a huge pain, you do have options. I recommend giving these solutions a try (and avoid making the same mistakes I did).

Please let me know if you’ve found any other issues or solutions, so that others can avoid going down the same broken path and enjoy referral spam-free Google Analytics data!

About the Author

Jannelle Chemko

Numbers Ninja & Digital Dynamo
Jannelle Chemko has been working in Operations and Accounting since 2007. After earning a Bachelor’s Degree in English, she is now in the midst of her CGA designation.

As strange as it sounds, Jannelle is a numbers and a letters guru: in addition to extensive full-cycle accounting experience in the technology and retail industries, Jannelle is also passionate about writing. In between crunching numbers and building Excel reports, she researches, creates content, and keeps up to date with digital trends.

When she’s not working to meet school and month-end deadlines, you can find Jannelle outside walking her dog, and enjoying the beautiful Vancouver air.