Counting unique visitors without using cookies, UIDs or fingerprinting.

on withcabin.com

over 1 year ago - 4 min read

Building a web analytics service without cookies poses a tricky problem: How do you distinguish unique visitors?

For most cookie-based solutions, it's easy: Store a unique identifier (UID) in a cookie on your computer, so we can identify you when you return. But if there's no cookie, there's no UID... or so you'd think.

Many privacy-focused analytics services will generate and store a UID on the server instead of saving it in a cookie - based on a hash of your User Agent, IP, Location, Date etc. This "fingerprint" is stored in a database and checked every time you visit the site to see if you've visited it before. To improve privacy, it can be washed from the database once it has served its purpose, e.g. on a daily basis. Some analytics services simply rely on the page referrer being on the same domain as your current URL.
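To make the server-side approach concrete, here is a minimal sketch of a daily fingerprint of the kind described above. It is not any particular service's implementation: the function names, the salt, the choice of SHA-256 and the in-memory set standing in for the database are all assumptions made for illustration.

```python
import hashlib
from datetime import date
from typing import Set

# Stand-in for the database table of stored fingerprints (assumption).
seen: Set[str] = set()


def daily_fingerprint(user_agent: str, ip: str, day: date, salt: str = "example-salt") -> str:
    """Hash request attributes into an identifier that changes every day."""
    material = "|".join([salt, day.isoformat(), ip, user_agent])
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


def is_returning(user_agent: str, ip: str, day: date) -> bool:
    """Check the fingerprint against the store, then record it."""
    fingerprint = daily_fingerprint(user_agent, ip, day)
    returning = fingerprint in seen
    seen.add(fingerprint)
    return returning
```

Because the date is part of the hashed material, the same visitor produces a different fingerprint the next day, which is what makes daily washing possible. The fingerprint still links every request from that visitor within the day.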

Privacy issues

Previous experiments at Normally have revealed that linking data points in a database in any way, such as with a UID, has the potential to reveal someone's identity. Connecting just a few data points such as the city, time and visited pages could tell us more than we need to know about that visitor and sometimes lead to their identification in the real world.

While building Cabin, we didn't want to use UIDs at all, and the referrer couldn't be relied upon in all browsers (some browsers and extensions can hide it). So we came up with a different approach.

The solution

Our solution doesn't require a database or anything stored on the server side. It even works in the oldest browsers. Here's how:

When the browser pings our server from a website for the first time, we send back a response with the header Cache-Control: no-cache, which tells the browser to store the response in its cache but revalidate it with the origin server before each use. Most importantly, we also send a Last-Modified header whose date is set to the beginning of the current day:

last-modified: Wed, 30 Nov 2022 00:00:00 GMT
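As a rough sketch, the two headers of that first response could be built like this. The function name is hypothetical, and using UTC midnight is an assumption based on the GMT date shown above.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from typing import Dict


def first_visit_headers(now: datetime) -> Dict[str, str]:
    """Headers for a visitor's first request of the day."""
    # Truncate the current time to midnight UTC (assumed day boundary).
    midnight = now.astimezone(timezone.utc).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return {
        "Cache-Control": "no-cache",
        # usegmt=True renders the HTTP date format, ending in "GMT".
        "Last-Modified": format_datetime(midnight, usegmt=True),
    }


headers = first_visit_headers(datetime(2022, 11, 30, 14, 25, tzinfo=timezone.utc))
print(headers["Last-Modified"])  # Wed, 30 Nov 2022 00:00:00 GMT
```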

From now on, every time the browser repeats this request, it sends the stored date back in the If-Modified-Since header. The server adds one second to that date and returns it to the browser:

last-modified: Wed, 30 Nov 2022 00:00:01 GMT

This way, the server can calculate the distance in seconds since midnight to give us a visit count.

The visit count is encoded within the date stored in the cached response on the visitor's machine.
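The counting step can be sketched as a single pure function. This is an illustration under stated assumptions rather than Cabin's actual code: count_visit is a hypothetical name, the browser is assumed to return the date in If-Modified-Since, and a missing, unparsable or previous-day date is assumed to mean the first visit of the day.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Optional, Tuple


def count_visit(if_modified_since: Optional[str], now: datetime) -> Tuple[int, str]:
    """Return (today's visit number, Last-Modified value to send back)."""
    midnight = now.astimezone(timezone.utc).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    visits_before = 0  # visits already counted today for this browser
    if if_modified_since:
        try:
            stored = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            stored = None  # malformed header: treat as a first visit
        # Assumption: dates from before today's midnight start a new count.
        if stored is not None and stored.tzinfo is not None and stored >= midnight:
            visits_before = int((stored - midnight).total_seconds()) + 1
    # Encode the running count as seconds past midnight.
    last_modified = format_datetime(
        midnight + timedelta(seconds=visits_before), usegmt=True
    )
    return visits_before + 1, last_modified


now = datetime(2022, 11, 30, 14, 25, tzinfo=timezone.utc)
first = count_visit(None, now)       # (1, "Wed, 30 Nov 2022 00:00:00 GMT")
second = count_visit(first[1], now)  # (2, "Wed, 30 Nov 2022 00:00:01 GMT")
```

Nothing is stored on the server between the two calls: the only state is the date string that the browser hands back.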

See this in action here 👉 demo

Counting bounces too

A bounce is when a visitor lands on your page and leaves without visiting another page on your site. A request that arrives without an If-Modified-Since date marks a new unique visitor, and we can use the same counter to total bounces for the day: we assume every unique visit is a bounce until a second request arrives, at which point we subtract it again. Later requests leave the bounce count unchanged.

First visit:

visits  uniques bounces
+1      +1      +1

Second visit:

visits  uniques bounces
+1      0       -1

Subsequent visits:

visits  uniques bounces
+1      0       0
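The three tables above translate directly into a small lookup on the visit number. The function and key names below are hypothetical, and the totals dictionary stands in for whatever aggregate counters the service keeps.

```python
from typing import Dict


def counter_deltas(visit_number: int) -> Dict[str, int]:
    """Map a visitor's visit number for the day to counter adjustments."""
    if visit_number == 1:
        # New visitor: counted as unique and provisionally as a bounce.
        return {"visits": 1, "uniques": 1, "bounces": 1}
    if visit_number == 2:
        # A second request means it was not a bounce after all.
        return {"visits": 1, "uniques": 0, "bounces": -1}
    return {"visits": 1, "uniques": 0, "bounces": 0}


totals = {"visits": 0, "uniques": 0, "bounces": 0}
# One visitor making three requests, then a second visitor making one.
for visit_number in (1, 2, 3, 1):
    for key, delta in counter_deltas(visit_number).items():
        totals[key] += delta

print(totals)  # {'visits': 4, 'uniques': 2, 'bounces': 1}
```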

Conclusion

Although these headers aren't intended for this, we are only updating and observing them on the server in line with browser standards.

This is great for privacy as we don't need to use cookies, IP addresses, fingerprinting or unique identifiers. In our tests, this method proved durable enough to be the most reliable method of counting unique visitors without using cookies.

We use this method in Cabin, our privacy-first, carbon-conscious web analytics service. Cabin is also compliant with all privacy laws. It's simple, lightweight, and enables you to track carbon emissions across your site. It's a great alternative to Google Analytics. Sign up for a free account to see it in action.


Footnotes:
  • View a gist of this demo as an express server here.
  • The possibility of leveraging the If-Modified-Since header was mentioned in 2008 by Bart Lateur.
  • Read more about Last-Modified on MDN Docs here.
  • This article first appeared on Normally Notes. You can read comments on Hacker News.

© 2024 Nic Mulvaney