Robin Rozhon

Debugging Google Analytics via Log Files

You’ve probably heard that log files are great for SEO. (Listen to those people. They are right!) But don’t think of them as just an SEO tool. Log files are useful in many other cases.


Need an example? Troubleshooting Google Analytics!


The canonical URL for our homepage used to include a subfolder, for a simple reason: there’s nothing in the root folder. Our homepage lives in a language subfolder, and because the root (e.g. https://www.domain.com/) was redirected via a 301 to the subfolder (e.g. https://www.domain.com/en/), Google used the subfolder URL (https://www.domain.com/en/) as our homepage.


Google search results before the change


For several reasons, we decided that we wanted Google to display the URL without the language subfolder (e.g. https://www.domain.com/) as our homepage.


Google search results after the change

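We got this new canonical URL into search results simply by switching the 301 redirect to a 302 redirect. A 302 signals a temporary move, so Google tends to keep showing the redirecting URL rather than the redirect target.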


Immediately after that, organic traffic to the homepage dropped significantly.


Initially, we thought that the site had lost visibility for important keywords. This was a reasonable assumption, but knowing that the vast majority of organic traffic to the homepage comes from branded search queries, we were skeptical about this conclusion.


Google Analytics data is wrong!

Being one of Canada’s greatest brands of all time and having over 5 million members makes it almost impossible to lose rankings for our own brand.


As expected, the Search Analytics report in Google Search Console, STAT Search Analytics, and SEMrush showed no changes in rankings.


The only logical explanation left was a problem with our tracking.


Digging into Google Analytics revealed a suspicious increase in direct traffic to the homepage. Strangely enough, the increase appeared on the same day as the decrease in organic traffic to the homepage.

Organic traffic to the homepage vs. direct traffic to the homepage

This made us even more confident that there was no real drop and that Google Analytics was attributing part of our organic traffic to direct traffic.
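
Why a lost referrer turns an organic visit into a direct one can be sketched in a few lines. This is a simplified illustration, not Google Analytics’ actual attribution logic: the function name classify_source and the short list of search engines are assumptions made for the example.

```python
from urllib.parse import urlparse

# Hypothetical, deliberately short list of search engine names.
SEARCH_ENGINES = {"google", "bing", "duckduckgo"}


def classify_source(referrer: str) -> str:
    """Return a rough channel label based only on the referrer."""
    # Access logs usually record a missing referrer as "-".
    if not referrer or referrer == "-":
        return "direct"
    host = urlparse(referrer).netloc.lower()
    # Match any label of the host name, e.g. "google" in "www.google.ca".
    if SEARCH_ENGINES & set(host.split(".")):
        return "organic"
    return "referral"
```

With a rule like this, the same visit from a search result is labelled organic while the referrer is intact and direct once the referrer is missing, which would produce exactly the mirrored drop and increase we saw.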


Log files are a gold mine

In our quest to find answers, we dove into the source of truth – log files.


We are fortunate to have the ELK stack (Elasticsearch, Logstash, and Kibana) implemented. This allows us to analyze log files in real time!


Report 1: Requests to the subfolder (https://www.domain.com/en/)

Report 2: Requests to the root folder (e.g. https://www.domain.com/)

The two reports above told us two things:


  • The number of homepage requests didn’t drop – the number of requests for the root folder is very similar to the number of requests for the subfolder before the change.

  • Something goes wrong during the 302 redirect from the root folder to the subfolder: had everything worked properly, we would never have seen the drop in the first report.
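
For readers without an ELK stack, the two reports can be approximated straight from raw access logs. The following is a minimal sketch, assuming the server writes the common "combined" log format; the sample lines, dates, and the function name count_requests are made up for illustration and are not taken from our logs.

```python
import re
from collections import Counter

# Assumes the "combined" access log format:
# ip ident user [time] "METHOD path protocol" status bytes "referrer" "user agent"
LINE_RE = re.compile(
    r'\S+ \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referrer>[^"]*)" "[^"]*"'
)


def count_requests(lines, paths=("/", "/en/")):
    """Count requests per (day, path) for the given paths."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match and match.group("path") in paths:
            counts[(match.group("day"), match.group("path"))] += 1
    return counts


# Hypothetical sample lines, one day before and one day after the change.
SAMPLE = [
    '203.0.113.5 - - [01/Mar/2018:10:00:00 +0000] "GET /en/ HTTP/1.1" 200 5120 "https://www.google.com/" "Mozilla/5.0"',
    '203.0.113.5 - - [02/Mar/2018:10:00:00 +0000] "GET / HTTP/1.1" 302 0 "https://www.google.com/" "Mozilla/5.0"',
    '203.0.113.5 - - [02/Mar/2018:10:00:01 +0000] "GET /en/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
]

if __name__ == "__main__":
    for (day, path), total in sorted(count_requests(SAMPLE).items()):
        print(day, path, total)
```

Plotting these daily counts for "/" and "/en/" side by side gives a rough equivalent of Report 1 and Report 2.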


Comparing one session before the change with one session after the change gave us the final proof we were looking for.


The referrer was overwritten during the 302 redirects.

Overwritten referrer
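
The session comparison can also be automated. The sketch below assumes the log entries of one session have already been parsed into records; the Hit type, the function name find_lost_referrers, and the example sessions are hypothetical. It flags every redirect whose follow-up request no longer carries the referrer that the redirected request had.

```python
from typing import List, NamedTuple, Tuple


class Hit(NamedTuple):
    """One parsed access log entry (hypothetical structure)."""
    path: str
    status: int
    referrer: str


def find_lost_referrers(session: List[Hit]) -> List[Tuple[Hit, Hit]]:
    """Return (redirect, next_request) pairs where the referrer changed."""
    lost = []
    # Walk through consecutive pairs of requests in the session.
    for prev, nxt in zip(session, session[1:]):
        is_redirect = prev.status in (301, 302)
        has_referrer = prev.referrer not in ("", "-")
        if is_redirect and has_referrer and nxt.referrer != prev.referrer:
            lost.append((prev, nxt))
    return lost


# Made-up sessions: one before the change (301) and one after (302).
before = [
    Hit("/", 301, "https://www.google.com/"),
    Hit("/en/", 200, "https://www.google.com/"),
]
after = [
    Hit("/", 302, "https://www.google.com/"),
    Hit("/en/", 200, "-"),
]

if __name__ == "__main__":
    print(find_lost_referrers(before))
    print(find_lost_referrers(after))
```

The check relies on the assumption that a request following a redirect should normally carry the same referrer as the redirected request, which is the behavior we saw in our own logs before the change.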

After chatting with our engineers and conducting some testing, we figured out that our WAF (web application firewall) was causing this.


Once we identified the cause, it was easy to introduce a fix that prevents overwriting of the referrer.


Traffic source attribution

It would have been difficult to identify the cause of this misattribution without log files because we could not reproduce it with the Google Tag Assistant extension or Google Analytics Debugger.


Log files are an invaluable source of information and I highly recommend you get familiar with them. They may help you understand what crawlers see on your website or troubleshoot Google Analytics.


Do you use log files? If so, please share what you used them for. I’m genuinely interested in hearing other use cases.
