Spam in Google Analytics (GA) is becoming a serious pain. Over the last few days, many people have asked me in the Web and SEO for Photographers Facebook Group how to deal with it. Over the last couple of years, we’ve seen pretty weird things showing up in our Google Analytics reports, but nothing like the spam that is being used now. Not only is this new spam sent as a language instead of the usual referrer spam, it also uses a fake “secret” Google domain and even carries a message supporting Trump in the recent election!

Secret.ɢoogle.com You are invited! Enter only with this ticket URL. Copy it. Vote for Trump!

Until now I was using plenty of filters in my Google Analytics to stop this spam. My .htaccess on the server is huge, just to stop this crap. Today I’ve found a better way to remove referrer spam in Google Analytics. So lucky you! Continue reading to find out how to fix it. Just be careful: you need a basic set of Google Analytics skills to get things right. If you don’t have time to deal with this, or if it is too complicated for you, I can set everything up for you, and I can check a few other important options in your Google Analytics to ensure that you receive only clean and meaningful data. Simply CONTACT ME.

So let’s go, time to remove some crap spam from your Analytics reports!

1. Create your hostname filter for ghost spam

Your hostname filter will prevent most of the spam from:

  • share-button sites
  • fake compliance cookie sites
  • site-auditor
  • spammers impersonating legit sites
  • and most of the “secret.Google.com” language spam.

This filter will be for your hostnames. As long as you add all of them, you don’t have to worry: you won’t exclude any real traffic. The main characteristic of ghost spam is that it never visits your site. Instead, it uses the Measurement Protocol to send hits directly to your Google Analytics. For that reason, this type of spam always leaves either a fake hostname or an “undefined” hostname, which appears as (not set) in your reports.

Find your Hostnames

To get the list of hostnames, go to the Network report in your Analytics and select the blue text “Hostnames” at the top of the report. Make a list of all the valid ones.

Build your Hostname regex

Once you have the list of all your hostnames, you have to create a regular expression (REGEX) that contains all of them. It is important that you add all your relevant hostnames, or you run the risk of losing valid data. Here is an example of a regex covering several kinds of hostname, including subdomains and a domain with hyphens (-):

tomrobakphotography\.com|cdn\.tomrobakphotography\.com|www\.tomrobakphotography\.com|sample\-domain\-tomrobak\.com
A few tips:
  • To separate each hostname, you need to use a bar or pipe character | ;
  • The dot . is a special character in REGEX, so you should add a backslash \ before it. The hyphen - is only special inside square brackets, but escaping it as in the example above is harmless;
  • Don’t leave any spaces;
  • The REGEX has a limit of 255 characters;
  • Don’t add a pipe/bar |, at the beginning or the end of the expression.
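The tips above can be sketched as a small Python helper that builds the expression from a plain list of hostnames. This is only an illustration, not part of Google Analytics: the function names escape_hostname and build_hostname_regex are made up for this example, and the 255-character check simply mirrors the limit mentioned above.

```python
from typing import Iterable


def escape_hostname(hostname: str) -> str:
    """Put a backslash before every dot and hyphen, as in the tips above."""
    return hostname.replace(".", r"\.").replace("-", r"\-")


def build_hostname_regex(hostnames: Iterable[str], max_length: int = 255) -> str:
    """Join the escaped hostnames with | and check the length limit."""
    # Strip surrounding spaces and skip empty entries, so the result has
    # no spaces and no pipe at the beginning or the end.
    cleaned = [h.strip() for h in hostnames if h.strip()]
    pattern = "|".join(escape_hostname(h) for h in cleaned)
    if len(pattern) > max_length:
        raise ValueError(
            f"pattern has {len(pattern)} characters, the limit is {max_length}"
        )
    return pattern


hostname_pattern = build_hostname_regex(
    [
        "tomrobakphotography.com",
        "cdn.tomrobakphotography.com",
        "sample-domain-tomrobak.com",
    ]
)
print(hostname_pattern)
```

If the helper raises a ValueError, the list is too long for a single filter pattern and needs to be shortened before you paste it into Analytics.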

Create the valid hostname filter

Once you are sure the expression is correct, you can create the filter to get rid of all the ghost spam.

  1. Go to the Admin tab, and select the view where you want to apply the filter
  2. Select Filters under the View column, and select + Add Filter
  3. Enter Valid Hostnames as the name for the filter
  4. In Filter Type, select Custom
  5. Make sure you choose Include and select Hostname from the dropdown.

  6. Copy and paste the hostname expression that you built into the Filter Pattern box.
  7. After making sure your filter is ok, click Save.
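Before saving, it can help to try the expression on a few hostnames outside Analytics. The sketch below is a rough Python simulation of an Include filter, under two assumptions: that the filter pattern matches anywhere in the hostname (partial match) and that it ignores case. The function name passes_hostname_filter and the sample hostnames are made up for illustration.

```python
import re

# The example hostname expression from this article.
VALID_HOSTNAMES = re.compile(
    r"tomrobakphotography\.com|cdn\.tomrobakphotography\.com"
    r"|www\.tomrobakphotography\.com|sample\-domain\-tomrobak\.com",
    re.IGNORECASE,  # assumption: the filter is not case sensitive
)


def passes_hostname_filter(hostname: str) -> bool:
    """Return True if an Include filter with this pattern would keep the hit."""
    # search() matches anywhere in the string (assumed partial matching).
    return VALID_HOSTNAMES.search(hostname) is not None


# Hypothetical sample values: two real hostnames and two ghost spam hostnames.
samples = [
    "www.tomrobakphotography.com",
    "sample-domain-tomrobak.com",
    "(not set)",
    "fake-spam-host.example",
]
for hostname in samples:
    verdict = "kept" if passes_hostname_filter(hostname) else "filtered out"
    print(f"{hostname}: {verdict}")
```

Any of your own hostnames that shows up as filtered out here is missing from the expression and should be added before you save the filter.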

2. Creating a filter for Crawler and Language Spam in Google Analytics

Crawler spam is much harder to detect since it uses a valid hostname, so you’ll need a different filter with an expression that matches all known crawler spam. To save you some time, we will use optimised REGEX expressions for crawler spam, which you’ll find below in the instructions. If you prefer, you can build your own the same way as the valid hostname expression. This time, you will filter on the source (referral) name.

  1. Go to the Admin tab.
  2. Under the last column “VIEW”, select Filters and then click + Add Filter
  3. Enter “Crawler Spam Filter” as a name.
  4. Filter Type > Custom > Exclude
  5. Filter Field > Campaign Source

  6. Filter Pattern > Paste the following crawler spam expression

The following expressions are optimised to block all crawler spam detected over the last couple of years.
Create one filter for each expression.

# Expression 1

(best|dollar|success|top1)\-seo|(videos|buttons)\-for|anticrawler|^scripted\.|semalt|forum69|7makemon|sharebutton|ranksonic|sitevaluation|dailyrank|vitaly|profit\.xyz|rankings\-|dbutton|uptime(bot|check|\.com)

# Expression 2

datract|hacĸer|ɢoogl|responsive\-test|dogsrun|tkpass|free\-video|keywords\-monitoring|pr\-cy\.ru|fix\-website|checkpagerank|seo\-2\-0\.|platezhka|timer4web|share\-buttons|99seo|3\-letter

# Expression 3 FOR LANGUAGE SPAM
Follow the same steps, but instead of “Campaign Source” select Language Settings.

\s[^s]*\s|.{15,}|\.|,
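To see what the three expressions catch, here is a minimal Python sketch that runs them against a few sample values. It assumes the filters match anywhere in the field and ignore case. The function names is_crawler_spam and is_language_spam and the sample values are made up for illustration; the expressions are copied unchanged from above.

```python
import re

# Expressions 1 and 2, applied to the source (referral) name.
CRAWLER_EXPRESSIONS = [
    re.compile(
        r"(best|dollar|success|top1)\-seo|(videos|buttons)\-for|anticrawler"
        r"|^scripted\.|semalt|forum69|7makemon|sharebutton|ranksonic"
        r"|sitevaluation|dailyrank|vitaly|profit\.xyz|rankings\-|dbutton"
        r"|uptime(bot|check|\.com)",
        re.IGNORECASE,
    ),
    re.compile(
        r"datract|hacĸer|ɢoogl|responsive\-test|dogsrun|tkpass|free\-video"
        r"|keywords\-monitoring|pr\-cy\.ru|fix\-website|checkpagerank"
        r"|seo\-2\-0\.|platezhka|timer4web|share\-buttons|99seo|3\-letter",
        re.IGNORECASE,
    ),
]

# Expression 3, applied to the language field.
LANGUAGE_EXPRESSION = re.compile(r"\s[^s]*\s|.{15,}|\.|,")


def is_crawler_spam(source: str) -> bool:
    """True if either crawler expression matches the source."""
    return any(expr.search(source) for expr in CRAWLER_EXPRESSIONS)


def is_language_spam(language: str) -> bool:
    """True if the language value looks like a spam message, not a language code."""
    return LANGUAGE_EXPRESSION.search(language) is not None


for source in ["google", "semalt.com", "ɢoogle.com", "(direct)"]:
    print(source, "->", "spam" if is_crawler_spam(source) else "ok")

for language in ["en-us", "pt-br", "Vote for Trump!"]:
    print(language, "->", "spam" if is_language_spam(language) else "ok")
```

The language expression works because real language codes such as en-us are short and contain no spaces, dots or commas, while the spam messages do.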

3. Enable “Exclude all hits from known bots and spiders”

There are many other crawlers around that are not spam but are not useful for your reports either, for example the ones crawling your site for indexing. These bots will leave a record in your reports if not excluded. This case is a bit easier, because Google Analytics has a built-in feature to exclude this traffic: in the Admin tab, open View Settings for your view and tick the Bot Filtering checkbox “Exclude all hits from known bots and spiders”.

4. Clean up Historical Spam Data in Google Analytics

The spam that is already stored in your Analytics (or any data for that matter) can’t be permanently deleted. That is why it is important to create the filters to stop receiving junk traffic. However, you can still clean your past data affected by spam by using the valid hostname expression you built previously and an advanced segment.

To eliminate the spam from your Google Analytics historical data you will have to create an advanced segment:

  1. In the Reporting section, click the box that says All Users (at the top of the graph). Next click the red button +NEW SEGMENT
  2. In the segment window, near the bottom, click Conditions
  3. First condition:
    Filter > Sessions > Include
    Dropdown 1 > Hostname
    Dropdown 2 > matches regex
    Textbox > Paste the Hostname Expression that you previously used for the filter
  4. Click +Add Filter at the bottom to add a new condition.
  5. Second Condition:
    Filter > Sessions > Exclude
    Dropdown 1 > Source
    Dropdown 2 > matches regex
    Textbox > Paste the Crawler Spam expression: (best|dollar|success|top1)\-seo|(videos|buttons)\-for|anticrawler|^scripted\.|\-gratis|semalt|forum69|7make|sharebutton|ranksonic|sitevaluation|dailyrank|vitaly|profit\.xyz|rankings\-|dbutton|\-crew|uptime(bot|check|\.com)|datract|hacĸer|ɢoogl|responsive\-test|torrent\-to|magnet\-to|dogsrun|tkpass|free\-video|keywords\-monitoring|pr\-cy\.ru|fix\-website|checkpagerank|seo\-2\-0\.|platezhka|timer4web|share\-buttons|99seo|3\-letter
  6. Click the button Or to the left of the condition you just configured
  7. Third Condition (To exclude the new language spam)
    Dropdown 1 > Language
    Dropdown 2 > matches regex
    Textbox > Paste the Anti-Language Spam expression: \s[^s]*\s|.{15,}|\.|,
  8. Enter your segment name and Save.

After saving the segment, you will be able to see spam-free reports, as long as the segment is selected. Eventually, the filters will do their work, and you won’t need to use the segment anymore.
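The logic of this segment can be sketched in a few lines of Python: a session stays in the report only if its hostname matches the valid hostname expression and neither its source nor its language matches a spam expression. This is an illustration only, assuming partial, case-insensitive matching. The Session type, the in_segment function and the sample sessions are made up for the example; the expressions are the ones used in the steps above.

```python
import re
from typing import NamedTuple


class Session(NamedTuple):
    hostname: str
    source: str
    language: str


HOSTNAME_RE = re.compile(
    r"tomrobakphotography\.com|cdn\.tomrobakphotography\.com"
    r"|www\.tomrobakphotography\.com|sample\-domain\-tomrobak\.com",
    re.IGNORECASE,
)
CRAWLER_RE = re.compile(
    r"(best|dollar|success|top1)\-seo|(videos|buttons)\-for|anticrawler"
    r"|^scripted\.|\-gratis|semalt|forum69|7make|sharebutton|ranksonic"
    r"|sitevaluation|dailyrank|vitaly|profit\.xyz|rankings\-|dbutton|\-crew"
    r"|uptime(bot|check|\.com)|datract|hacĸer|ɢoogl|responsive\-test"
    r"|torrent\-to|magnet\-to|dogsrun|tkpass|free\-video|keywords\-monitoring"
    r"|pr\-cy\.ru|fix\-website|checkpagerank|seo\-2\-0\.|platezhka|timer4web"
    r"|share\-buttons|99seo|3\-letter",
    re.IGNORECASE,
)
LANGUAGE_RE = re.compile(r"\s[^s]*\s|.{15,}|\.|,")


def in_segment(session: Session) -> bool:
    """Include valid hostnames, then exclude crawler OR language spam."""
    valid_host = HOSTNAME_RE.search(session.hostname) is not None
    crawler = CRAWLER_RE.search(session.source) is not None
    language_spam = LANGUAGE_RE.search(session.language) is not None
    return valid_host and not (crawler or language_spam)


# Hypothetical sessions: one real visit and three kinds of spam.
sessions = [
    Session("www.tomrobakphotography.com", "google", "en-us"),
    Session("(not set)", "semalt.com", "en-us"),
    Session("www.tomrobakphotography.com", "free-video-tool.com", "en-us"),
    Session("www.tomrobakphotography.com", "(direct)", "Vote for Trump!"),
]
for s in sessions:
    print(s, "->", "kept" if in_segment(s) else "excluded")
```

Only the first sample session is kept: the second fails the hostname condition, the third is caught by the crawler expression and the fourth by the language expression.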

If this article helped you, please consider sharing it or leaving a comment with your experience. It may help other people!

source: ohow