What the Hit?

In Google Analytics, you will stumble upon the term “hit” frequently. The first time I heard that Google Analytics collected “hits”, I had a hard time understanding what that actually meant. In my experience, this is common among new (or not so technical) analysts and marketers.

Yet understanding the concept of a hit and its relation to a session is crucial for becoming a good (Google Analytics) web analyst, because hits are the actual data sent from websites to Google Analytics. In this post I will dive deeper into hits, and I hope you will come away with at least a slightly better understanding of how Google Analytics works.

What is a hit?

So first things first. What is a hit? According to Google's support page, a hit is 'An interaction that results in data being sent to Analytics'. The article further states that 'Common hit types include page tracking hits, event tracking hits, and ecommerce hits'. Alright, so in other words a hit is an interaction, and there are multiple types of interactions, such as pageviews and events. Still, I find this definition rather vague and hard to conceptualize.

A more practical approach to understanding hits is to actually have a look at one. Fortunately, this is really easy. There are various tools and extensions that help you see the hits being sent to Google Analytics (my favourite is Google Analytics Debugger). However, you can also inspect the hit data without any third-party tools by looking directly at the HTTP requests sent from your browser. In Chrome, pressing Control + Shift + J (Windows) or Command + Option + J (Mac) opens the developer tools on the Console tab.

If you click on the Network tab, you will see all requests being sent to and from the browser. You will probably have to reload the page to see any requests, because the Network tab only records requests made while the developer tools are open.

Once you have reloaded the page, type “google-analytics” in the filter box in the upper left corner of the Network tab and you will see all requests related to Google Analytics.

As you might be able to decipher from the blurry screenshot above, three requests show up in the list. The first request is named 'analytics.js', the second starts with 'collect?v=1' and the third starts with 'collect?v=2'. The first request is actually not a hit, but the script loaded by the Google Analytics tracking snippet. It contains the library of methods used for sending data to Google Analytics. The remaining requests starting with 'collect?v…' are Google Analytics hits.

Click on one of the hits and a new tab called 'Headers' will show up. It gives a lot of information about how the request is sent and what data it carries. If you scroll down to the last section of the tab, called 'Query String Parameters', you will find a list of parameters, each assigned a value. The first parameter in the list is 'v', which in my case is set to 1.

The query string parameters are actually the data that is sent and processed by Google Analytics, and each pair represents a key (constant) and its corresponding value (variable). The key often matches a dimension in Google Analytics.

There are a bunch of parameters that can be sent to Google Analytics, so we won't have time to cover them all. But let’s look into some of the parameters that I find the most useful. The first pair in the list has the key 'v' and the value '1'. 'v' represents the protocol version, which for hits to standard Google Analytics properties is 1. The protocol version for hits to the new Google Analytics 4 properties is 2.

Another relevant parameter is 't' which represents the hit type. The hit type must be equal to “pageview”, “screenview”, “event”, “transaction”, “item”, “social”, “exception”, or “timing”. In the screenshot above, the hit type is equal to pageview as the hit is sent when the page is loaded.

'cid' is yet another key that you should have in your vocabulary. It contains the client ID, a unique string used to identify returning visitors. 'tid' is a parameter you might recognize. It holds the tracking ID of the Google Analytics property to which the data is sent.

Lastly, at the bottom of the list there are two parameters starting with 'cd'. These are custom dimensions, which can be sent to Google Analytics alongside the standard parameters. As you can see, in my case their values are identical to the client ID and the hit type. I send these values twice because client ID and hit type are not available directly as dimensions in the Google Analytics UI, so I send them as custom dimensions as well. The complete list of all available (protocol 1) parameters is found here.
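To make the key and value pairs concrete, here is a minimal Python sketch that takes a hit URL apart using only the standard library. The URL is a made-up example, not one copied from my browser: the tracking ID and client ID are placeholders, and I am assuming the two custom dimensions sit at index 1 and 2. The helper names parse_hit and custom_dimensions are my own, not part of any Google library.

```python
from urllib.parse import urlsplit, parse_qsl

# Hypothetical hit URL. The tracking ID, the client ID and the custom
# dimension indexes (cd1, cd2) are placeholders chosen for illustration.
HIT_URL = (
    "https://www.google-analytics.com/collect"
    "?v=1&t=pageview&tid=UA-XXXXX-Y&cid=1234567890.0987654321"
    "&cd1=1234567890.0987654321&cd2=pageview"
)

# The hit types listed earlier in this post.
HIT_TYPES = {
    "pageview", "screenview", "event", "transaction",
    "item", "social", "exception", "timing",
}


def parse_hit(url):
    """Return the query string parameters of a hit URL as a dict."""
    query = urlsplit(url).query
    return dict(parse_qsl(query, keep_blank_values=True))


def custom_dimensions(params):
    """Return custom dimensions (keys 'cd1', 'cd2', ...) keyed by index."""
    return {
        int(key[2:]): value
        for key, value in params.items()
        if key.startswith("cd") and key[2:].isdigit()
    }


params = parse_hit(HIT_URL)
dimensions = custom_dimensions(params)

print("protocol version:", params["v"])
print("hit type:", params["t"], "(known type)" if params["t"] in HIT_TYPES else "(unknown type)")
print("tracking ID:", params["tid"])
print("client ID:", params["cid"])
print("custom dimensions:", dimensions)
```

If you copy a real request URL from the Network tab and paste it in place of HIT_URL, the same functions should list whatever parameters your own site sends.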

There is much more to be said about all the data that can be sent to Google Analytics, but once you understand the basics you should be able to dig deeper on your own. Just as essential as knowing what data is sent with each hit is understanding how Google Analytics processes hits and stitches them together into sessions.

As we have discovered, hits are momentary data points representing individual actions taken by a visitor, such as viewing a page or making a transaction. However, when crunching the numbers in Google Analytics you won't see the individual hits. Instead, you will see sessions. So how do sessions relate to hits?

Hits and sessions

Sessions are essentially a series of hits made by the same user within a given time frame. Every time a hit is sent, the Google Analytics backend checks the time between the current and the previous hit from the same user (client ID). If the time between the hits is less than 30 minutes (this is the default time frame, but it can be modified), the hits are seen as part of the same session.

This continues until no more hits are registered within the set time frame, at which point the session ends. There are two exceptions to this rule. The first is that all hits in a session must occur on the same day. If midnight passes, the session breaks into two even if there is less than 30 minutes between the last two hits. The second is that all hits must belong to the same campaign and source / medium. Otherwise the session splits when the campaign changes. For a more detailed explanation, read the support page.
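The stitching rules above can be sketched in a few lines of Python. This is my own simplified illustration, not Google's actual processing code: the Hit class and the sessionize function are hypothetical names, the timestamps are assumed to already be in the reporting time zone, and a single campaign string stands in for campaign and source / medium. Real campaign handling has more nuance than a plain comparison.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)  # the default time frame


@dataclass
class Hit:
    client_id: str
    timestamp: datetime  # assumed to be in the reporting time zone
    campaign: str        # simplified stand-in for campaign and source / medium


def sessionize(hits):
    """Group hits into sessions (lists of hits), one client ID at a time."""
    sessions = []
    previous = {}  # client ID -> last hit seen for that client
    current = {}   # client ID -> session that is currently open
    for hit in sorted(hits, key=lambda h: h.timestamp):
        prev = previous.get(hit.client_id)
        starts_new_session = (
            prev is None
            or hit.timestamp - prev.timestamp >= SESSION_TIMEOUT  # timeout
            or hit.timestamp.date() != prev.timestamp.date()      # midnight
            or hit.campaign != prev.campaign                      # campaign change
        )
        if starts_new_session:
            current[hit.client_id] = []
            sessions.append(current[hit.client_id])
        current[hit.client_id].append(hit)
        previous[hit.client_id] = hit
    return sessions


hits = [
    Hit("A", datetime(2021, 3, 1, 23, 40), "google / organic"),
    Hit("A", datetime(2021, 3, 1, 23, 55), "google / organic"),  # same session
    Hit("B", datetime(2021, 3, 1, 23, 50), "google / organic"),  # other client
    Hit("A", datetime(2021, 3, 2, 0, 5), "google / organic"),    # after midnight
    Hit("A", datetime(2021, 3, 2, 0, 10), "newsletter / email"), # new campaign
    Hit("A", datetime(2021, 3, 2, 0, 50), "newsletter / email"), # 40 minute gap
]

sessions = sessionize(hits)
for number, session in enumerate(sessions, start=1):
    print(number, session[0].client_id, len(session), "hit(s)")
```

In this made-up example, client A ends up with four sessions even though all of A's hits arrive within little more than an hour, because each of the three splitting rules kicks in once.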

This means that a session can consist of anything from a single hit up to the 500-hit limit set by Google Analytics. Unless you create a custom dimension that counts the number of hits sent, there is no way to access this information in Google Analytics without using the API (although in some cases it is possible to deduce it from the User Explorer report).

So what?

So why is it valuable to understand the connection between hits and sessions? Well, to be honest, most people won't really benefit from this information. But for those of us trying to 'solve' complex problems such as attribution, clustering, and CLV, it is common to use the raw hit-level data from Google Analytics rather than the processed session-level data. Some real-life cases where I have used hit-level data are creating a Markov chain based attribution model in R and predicting purchases with a classification model in BigQuery ML.

For tracking purposes, it is also invaluable to grasp the concept of hits. It is fundamental knowledge for setting up correct tracking, and fixing bugs, such as the Rogue Referral Problem. So if you want to become an ace in analysis and/or tracking, be prepared to learn more on this topic.

That's it for this blog. Feel free to leave a comment, and thanks for reading!