As regular readers of this blog will know, I am no friend of the surveillance capitalism that currently powers the web. It is becoming well-nigh impossible to develop any software or host any content without some component tracking your users & sending data to third parties.
Even though I try very hard not to fall for this, periodically I discover that my sites also embed things that embed things that potentially leak your details to third parties. This week I found out that the math rendering that comes with the excellent site building software I use is actually served by Cloudflare, something I try to avoid.
Many of us have become inured to every page on the internet talking to all kinds of parties that neither the reader nor the author ever consciously wanted to share data with. But I for one still don’t like it.
Even those aware of all this leaking may rationalise it as the price to pay for getting the things we need. But it turns out you can get a lot of what you need without snitching on your audience to random servers on the Internet.
Metrics for your articles
Tons of websites now report their visitors to Google Analytics. Even very privacy-sensitive places (including governments) have lost the battle against their marketing departments & caved. The deal here is that as a site operator, you gain insight into your audience. The price you pay is that you share your visitor data with Google.
Before I saw the light, I spent quite some time looking at such analytics, and the graphs sure are pretty. But they rarely told me anything actionable. I doubt the price is worth it. Ask yourself: have any of those fancy maps of where visitors come from ever changed your behaviour?
Also, was it worth the GDPR cookie warning?
Some things you do need to know
But the thing is, if you host or write content, you would like to know if people are actually reading it. Lots of “hits” turn out to be crawlers, bots or scripts. It is also nice to know if your human visitors are making it to the end of your articles, or if they are bailing after 25%.
Last I checked, most of the “pay with your visitors' data” analytics platforms don’t actually tell you this.
Every author is under pressure to make their articles as short as possible. “Kill your darlings” they say, and this is true. Just because you typed it in doesn’t mean it is worth reading.
At the same time, it would be great to have actual data showing whether articles are too long or too short. If 100% of your readers are making it to the end of your story, it is a reasonable bet they would’ve liked to read more, for example.
A set of three simple scripts generates graphs like this one:
These are the reading statistics of a rather long-winded article I wrote on the CureVac SARS-CoV-2 vaccine.
The audience-minutes.js script reports a proportion of the minutes readers were active on a page. With each of these reports, it also notes how far the reader had scrolled at that point.
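To make the idea concrete, here is a minimal sketch of how such sampling could work. This is not the actual audience-minutes.js source: the names `SAMPLE_RATE`, `REPORT_URL`, `scrollPercent`, `shouldReport` and `buildReport` are hypothetical, as are the 10% sampling rate and the same-origin `/report` endpoint. The sketch assumes that once a minute, if the reader did anything at all, a report is sent with some probability, carrying only the page path and the scroll position.

```javascript
// Hypothetical sketch, NOT the real audience-minutes.js code.
const SAMPLE_RATE = 0.1;      // assumed: report roughly 1 in 10 active minutes
const REPORT_URL = "/report"; // assumed: an endpoint on your own server

// How far down the page (0-100) the bottom of the viewport has reached.
function scrollPercent(scrollTop, viewportHeight, documentHeight) {
  if (documentHeight <= 0) return 0;
  const reached = scrollTop + viewportHeight;
  return Math.max(0, Math.min(100, Math.round((100 * reached) / documentHeight)));
}

// Report this minute only if the reader was active, and then only
// with probability sampleRate. `random` is injectable for testing.
function shouldReport(wasActive, sampleRate, random = Math.random) {
  return wasActive && random() < sampleRate;
}

// The report body holds no identifiers: just the path and the scroll depth.
function buildReport(path, percent) {
  return JSON.stringify({ path: path, scrollPerc: percent });
}

// Browser wiring; skipped when no DOM is available.
if (typeof window !== "undefined" && typeof document !== "undefined") {
  let active = false;
  for (const ev of ["scroll", "keydown", "mousemove", "touchstart"]) {
    window.addEventListener(ev, () => { active = true; }, { passive: true });
  }
  setInterval(() => {
    if (shouldReport(active, SAMPLE_RATE)) {
      const pct = scrollPercent(
        window.scrollY,
        window.innerHeight,
        document.documentElement.scrollHeight
      );
      navigator.sendBeacon(REPORT_URL, buildReport(location.pathname, pct));
    }
    active = false; // start the next minute with a clean slate
  }, 60 * 1000);
}
```

Because nothing is stored in the browser and every report stands alone, a sketch like this needs no cookies or local storage. The trade-off is that individual reports cannot be tied together into visits, so you learn about minutes and scroll depth, not about people.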
What we can see in the graph is that a lot of the samples happened in the first 10% of the page. This could represent “bounces”, folks that land on the article and decide it is not for them.
Secondly, we can see that around half the remaining readers bail at the 40%-50% mark.
And finally, we can also conclude that the remaining readers probably liked what they saw, because there was no further drop-off, straight until the end.
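A graph like this boils down to counting samples per slice of the page. The following sketch shows one way to do that, under the assumption that each report carries a scroll percentage between 0 and 100. The function names `scrollHistogram` and `shares` are made up for illustration, and the actual scripts may well process the data differently.

```javascript
// Hypothetical sketch: count scroll-depth samples per slice of the page.
// With the default of 10 buckets, bucket 0 covers 0-10%, bucket 9 covers 90-100%.
function scrollHistogram(samples, buckets = 10) {
  const counts = new Array(buckets).fill(0);
  const width = 100 / buckets;
  for (const pct of samples) {
    const clamped = Math.max(0, Math.min(100, pct));
    // A sample at exactly 100% belongs in the last bucket.
    const idx = Math.min(buckets - 1, Math.floor(clamped / width));
    counts[idx] += 1;
  }
  return counts;
}

// Convert counts to the fraction of all samples in each bucket.
function shares(counts) {
  const total = counts.reduce((a, b) => a + b, 0);
  return counts.map((c) => (total === 0 ? 0 : c / total));
}

// Example: three samples near the top, one halfway, one at the very end.
const counts = scrollHistogram([0, 5, 9, 45, 100]);
console.log(counts);
console.log(shares(counts));
```

A large first bucket followed by a much smaller second one is what a pile of "bounces" would look like in such a histogram, while roughly level counts from some point to the end suggest the readers who got that far kept going.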
More discussion on what these graphs look like & what they might mean can be found in the README file.
The majority of sites on the Internet snitch on their visitors to many third parties, for questionable gains. With some simple scripts it is possible to gather statistics on how many minutes are spent on your site, while also learning if readers are making it to the end of your articles or not. Crucially, this software involves no tracking, no cookies, no local storage and no third parties.
The audience-minutes.js script can be dropped into most websites, is open source, and can be found on my GitHub.