2025-09-16 08:00:00
I write this blog. Does anyone read it? How could I tell?
In the old days of the web your web server recorded a log when a page was requested, and various tools would analyze those logs to tell you about your visitors. Today these logs are mostly useless when it comes to looking at human traffic, because the majority of traffic is bots, especially now that the AI companies are running their own web crawls.
Some bots like Googlebot label themselves with the User-Agent header or IP range, but there are many other bots, including those that identify themselves as a browser. (My decades out of date recollection is there was a mechanism at Google as well to fetch pages in a way that didn't appear to be a bot.)
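To make the limits of User-Agent filtering concrete, here is a minimal sketch of separating self-labeled bots from everything else in a log. The function name and the keyword list are assumptions made up for this example, not any standard, and by construction it cannot catch a bot that sends a browser's User-Agent.

```javascript
// Hypothetical helper: true when a User-Agent string labels itself as a bot.
// The keyword list is an assumption for illustration, not a complete registry.
const BOT_HINTS = /bot|crawler|spider/i;

function looksLikeDeclaredBot(userAgent) {
  // A missing User-Agent is treated as "not a normal browser" here.
  if (!userAgent) return true;
  return BOT_HINTS.test(userAgent);
}

const googlebot = 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)';
const browser = 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36';

console.log(looksLikeDeclaredBot(googlebot)); // true
console.log(looksLikeDeclaredBot(browser));   // false
```

Anything that returns false here is only "did not announce itself", which is exactly the gap described above.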
Instead, today's web access logging uses JavaScript. A script on the page gathers information about the visitor and POSTs it to some logging endpoint. This is how, for example, Google Analytics works. Some random site claims it is used on more than half of all websites, which means using Google Analytics gives Google yet another hook into where ~everyone on the internet is browsing.
"Telemetry" script is yucky, what could you do otherwise? Here's a trick that doesn't require JavaScript, but also doesn't work.
Embed an invisible image in every page:
<img src=/log width=1 height=1>
Bots that only fetch HTML and traverse links won't hit this logging endpoint.
The Referer header passed in the hits to the /log path will tell you the page the <img> tag was on.
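For illustration, the server side of that /log endpoint might look roughly like the sketch below, written against Node's request/response objects. The names pageFromReferer, handleLogHit, and record are hypothetical, and the no-store cache header is my assumption about how to make every page view reach the server.

```javascript
// A commonly used transparent 1x1 GIF, base64-encoded.
const PIXEL = Buffer.from(
  'R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7', 'base64');

// Returns the path of the page that embedded the pixel, or null if unknown.
function pageFromReferer(headers) {
  const referer = headers['referer']; // Node lowercases incoming header names
  if (!referer) return null;
  try {
    return new URL(referer).pathname;
  } catch {
    return null; // malformed Referer
  }
}

// Hypothetical handler for GET /log; `record` is whatever stores the hit.
function handleLogHit(req, res, record) {
  record({
    page: pageFromReferer(req.headers),
    userAgent: req.headers['user-agent'] || '',
  });
  res.writeHead(200, { 'Content-Type': 'image/gif', 'Cache-Control': 'no-store' });
  res.end(PIXEL);
}

// Wiring it up (not executed here):
//   require('http').createServer((req, res) => {
//     if (req.url === '/log') handleLogHit(req, res, console.log);
//   }).listen(8080);
```

Note that the Referer header is entirely under the client's control, so this is a hint about the page, not proof.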
Unfortunately, bots these days are interested in images too.
What if you do use JavaScript? It turns out the fancy bots run JS too. What is something a bot won't do?
One idea I had is that a bot is unlikely to linger on any given page; they have other places to go. I tried a script that used setTimeout to only record the page load as a visit if the browser hung around for three seconds.
It appears to work better than the other things I've tried, but within a day I spotted the Baidu bot fetching my homepage and then three seconds later fetching the logging endpoint. Is it possible they're actually running the page script and waiting? Maybe I need a longer timer?
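A minimal sketch of what such a delayed beacon could look like follows. It is not the script this blog actually ran: the function name, the injected send and schedule parameters, and the /log path reused from the image trick are all assumptions chosen so the logic can be exercised outside a browser.

```javascript
// Sketch of a delayed visit beacon. `send` and `schedule` are injected so the
// logic can run outside a browser; both parameter names are assumptions.
function scheduleVisitBeacon(send, { delayMs = 3000, schedule = setTimeout } = {}) {
  // If the page is closed before the timer fires, nothing is ever sent.
  return schedule(() => send('/log'), delayMs);
}

// In a page this might be wired up as (not executed here):
//   scheduleVisitBeacon((url) => navigator.sendBeacon(url));
```

Trying a longer timer is then a matter of passing a larger delayMs, though a bot that runs the script and waits will pass any fixed delay.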
This blog is also published as a feed. Feed readers fetch its content and resyndicate it within their own UI. I haven't tried, but I doubt they'd run my script.
Some feed readers, when they fetch the feed, report how many subscribers they are acting on behalf of. Is that a count of human readers? I don't think so. When I used a feed reader in the past, I had subscriptions I sometimes didn't read.
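As a rough sketch, pulling that number out of a feed fetcher's User-Agent could look like the following. The "N subscribers" wording is an assumption about how readers phrase it, and the sample User-Agent is made up for this example rather than copied from a real reader.

```javascript
// Extracts "N subscribers" from a feed fetcher's User-Agent, or null if absent.
function subscriberCount(userAgent) {
  const match = /(\d+)\s+subscribers?/i.exec(userAgent || '');
  return match ? Number(match[1]) : null;
}

// Illustrative User-Agent, invented for this example.
const sample = 'ExampleReader/1.0 (+https://reader.example/fetcher; 12 subscribers)';
console.log(subscriberCount(sample)); // 12
```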
On the one extreme, increasingly savvy bots will get ever closer to appearing like human traffic in logs. On the other, humans read via feed readers or without JavaScript and aren't logged anyway. Heaven forbid someone prints my posts and reads them on paper, they're impossible to track!
This problem is an instance of a bigger pattern you might encounter in engineering: sometimes when you get down to implementing a measure, you find an endless maze of increasingly confusing corner cases. What if someone loads the page, but they only distractedly skim it? That's not really a reader, is it? What if someone loads the page and finds what they were looking for immediately, before the logging beacon runs?
What these kinds of problems can indicate is that you need to take a step back and reconsider what your real objective is. What is my objective? I think I write this blog for two reasons.
Taking the time to serialize my thoughts, chasing down all the holes my inner critic spots, is a way for me to consolidate and archive my knowledge. For that purpose I don't need anyone to read it but me.
I write for an imagined audience of another me, someone with my interests and skill level who didn't yet know the thing I learned. Sometimes I write the post that is exactly the thing someone else needed, and they end up reaching out. For this purpose I need an email address, not an access log.
2025-09-13 08:00:00
A while ago I was tinkering with a project involving AI and voice. Google's docs linked to suggested partners.
I created an account with one. Immediately when I tried to log in it refused with an error. I sent a message to their customer service — no response — and forgot about it. A few weeks later I got a personalized email from the company: "Hi, this is John in developer relations, we noticed you made an account but never used it, can we help you?" I immediately responded, "Yes, I want to pay for your product, but I cannot log in, can you help?" Again, no response.
Epic Games made a video game store to challenge Steam, complete with exclusive games. I wanted to buy one. Their store account system was apparently shared with their existing popular games. A gamer in the past had given them my email address to play a game, and though I have always controlled my email address and had never verified any account, they would not let me create an account because they believed my email was already associated with an account.
I contacted their support and they were able to clear the association. But still I could not create an account on the store — another different person was using my email address too!? I think I went through this process with their support with maybe four accounts before I was able to create one.
Blizzard's Battle.net was in a similar state. But they won't let you contact support unless you first log in with your email address. I used the fact that I control my email address to take over the account associated with my email address, but that now means they have a verified randouser1234 account associated with my email address that does not have my purchases on it. (I have a Blizzard account from an old email address that I am attempting to get rid of.)
Someone wanted to send me some cryptocurrency. I prefer pieces of paper with pictures of US presidents, so I tried creating an account on Coinbase, which I understand to be a popular tool for turning cryptocurrency into US dollars.
After creating an account it took me to some intermediate page (I forget the details — something about account verification?) that was half blank. Digging in the browser tools I could see there was some JS exception in React that was getting caught and reported back to the server. I never successfully got an account.
Various sites now offer to use passkeys to let you sign in. I must've clicked ok in the past because whenever I tried it later I got prompted for a PIN that I didn't know, with no visible mechanism to reset it.
Thankfully in this case the guy responsible for passkeys at Google was my old officemate and he was able to walk me through it. Short answer is that Chrome on iOS can prompt you for a "Google Password Manager" PIN, but there is no way to reset it on mobile, nor via the Google Password Manager website. Instead there is a third Google Password Manager UI within desktop Chrome that I needed to use.
As posted there, once I attempted to actually use the passkey to sign into GitHub, I got an error message telling me to use a password instead. The existence of that error text means someone had to write the code that forwards you from one to the other.
For another project I wanted to be able to send a very small number of emails. A friend recommended Amazon SES, part of the AWS suite. I went to create an AWS account: "nope, your email address is already associated with an account". I click sign in using my email: "nope, no account associated with that email".
I contacted them. Customer service sent me a useless response that suggested I reset my password.
(From someone else on Reddit with the same problem I discovered a workaround that looks like it might work: you can sign up for AWS with [email protected].)
Forget trying to build new features that attract customers. Here I've given multiple cases where I was already ready to pay and could not do the most basic first step with these products. And I am not even a complex case.
The economies of scale with most big companies like these are that it's not even worth figuring out what any one user's problem is, it's better to just mark them off as X% lossage while trying to find new users elsewhere. I get it, but it's maddening.
My own brother did something (it's unclear, but nothing nefarious) and lost access to his Google account, which meant losing access to many personal files, and for whatever reason all their recovery processes refused him. He eventually gave up.
2025-08-21 08:00:00
Jujutsu ("jj") sits atop a Git repository and its commands mostly mirror into Git operations; for example, a jj commit is a Git commit.
When collaborating with others with Git you push and pull branches. Meanwhile, jj has a feature called "bookmarks" that is the mechanism for working with Git branches, but bookmarks behave fairly differently from Git branches.
This post goes into the why and how to use bookmarks for Git collaboration.
Part of jj's whole deal is that it collapses many Git concepts (stashes, staging, fixups, in-progress rebases, conflicts) into a single unified model of working with history, which then lets you use the same tools to do all of those things. For example, to fix up an old commit you jump to it, edit it, and jump back to where you were; to fix a rebase conflict you jump to the conflicting commit, edit it, and jump back to where you were, using the same commands.
All this jumping around means that the Git idea of being "on" a particular branch does not make sense in jj. When working on a change I might stop part way through doing one thing, start a different thing based on a commit a few steps back, possibly reshuffle commits around, and have a few extra commits on the side with experiments lingering around as well. Based on my former Git expertise I might have done this kind of thing by making a bunch of Git stashes and branches.
Instead, in jj when you work you are "on" a commit, and when you switch you switch between commits, not branches. After a year of using jj I can assure you that not having branch names for these has worked out just fine.
Like a Git branch, a jj bookmark is a name that points to a commit, and there are the commands you'd expect to create/delete/rename and move bookmarks around. Unlike Git branches, bookmarks are fixed to a commit unless you manually move them; when you create new commits jj does not automatically move bookmarks around.
In my experience with jj, I have had no use for bookmarks other than for interacting with Git. In principle you could use them to make note of important commits, which I suppose is where the name comes from. Maybe other people have different workflows.
In a colocated jj/Git repository (which is the normal way to use jj), bookmarks are 1:1 with Git branches: creations/modifications/etc via either system are reflected in the other.
After cloning a Git repository, jj creates "remote" bookmarks with names like main@origin. These are immutable and represent the state of the remote repository.
You could also make a local bookmark named main that is wholly independent. But on a fresh clone, the local bookmark main is marked as tracking main@origin. Conceptually this is similar to Git's notion of an "upstream" branch, but with different behavior.
Suppose main is a tracking bookmark. jj attempts to keep it in sync with main@origin:
On jj git push, if main is ahead of main@origin, jj pushes the changes. (When you're in a state where a push would make a change, jj status shows the bookmark name as main*.)
On jj git fetch, jj updates main@origin and also updates your local main if it's behind.
If after a fetch the two sides diverge (each contains commits the other lacks), then the local main will be marked as conflicting and point to both commits. This displays in status as main??. You will need to manually choose where it points with jj bookmark set main -r ... to fix it before using it again.
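The push/fetch rules above can be captured in a toy model. This is only a sketch of the behavior as described in this post, not jj's implementation: the commit graph, the function names, and the rule that a push is refused unless the local side is ahead are all assumptions of the model.

```javascript
// Toy commit graph: each commit id maps to its parent ids. C and D both
// descend from B, so they diverge from each other.
const parents = { A: [], B: ['A'], C: ['B'], D: ['B'] };

// True when `ancestor` is reachable from `commit` (or is the same commit).
function isAncestor(ancestor, commit) {
  const stack = [commit];
  while (stack.length > 0) {
    const id = stack.pop();
    if (id === ancestor) return true;
    stack.push(...(parents[id] || []));
  }
  return false;
}

// A bookmark is { local: [ids], remote: id }. `local` holds one target
// normally and two when conflicted (shown as main??).
function fetchBookmark(bookmark, serverCommit) {
  if (bookmark.local.length > 1) return { local: bookmark.local, remote: serverCommit };
  const [local] = bookmark.local;
  let nextLocal;
  if (isAncestor(local, serverCommit)) nextLocal = [serverCommit]; // behind: move forward
  else if (isAncestor(serverCommit, local)) nextLocal = [local];   // ahead: keep local
  else nextLocal = [local, serverCommit];                          // diverged: conflict
  return { local: nextLocal, remote: serverCommit };
}

function pushBookmark(bookmark) {
  if (bookmark.local.length !== 1) throw new Error('bookmark is conflicted; set it first');
  const [local] = bookmark.local;
  // Assumption of this model: only pushes that move the remote forward are allowed.
  if (!isAncestor(bookmark.remote, local)) throw new Error('local is not ahead of remote');
  return { local: [local], remote: local };
}

// Mimics the markers described above: main, main* (push pending), main?? (conflicted).
function statusLabel(bookmark) {
  if (bookmark.local.length > 1) return 'main??';
  return bookmark.local[0] === bookmark.remote ? 'main' : 'main*';
}

// Local main is at C, and someone else pushed D in the meantime.
console.log(statusLabel(fetchBookmark({ local: ['C'], remote: 'B' }, 'D'))); // main??
```

In this model, resolving the conflict corresponds to replacing local with a single commit, which is what jj bookmark set main -r ... does in the real tool.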
At least for me this was super weird at first, but now makes so much sense that I cannot remember why I was confused. I think the right way to think about it is that a tracking bookmark is modeling "what I intend this bookmark to be, both locally and remotely" and the jj push/fetch commands keep that in sync.
As distinct from Git, note there are no separate "fetch" and "pull" commands. (A historical note: apparently both "fetch" and "pull" commands existed in Git and Mercurial. They agreed that one meant "download the changes" and the other meant "do that and also merge them", but they flipped the meanings!)
If you are just making changes locally and just want to push your changes to main, you must update the bookmark before pushing with a command like jj bookmark set main -r @. This is currently the clunkiest part of jj. There have been conversations in the project about how to improve it.
If you search for jj tug online you will see a common alias people set up to automate this.
If you are comfortable with Git push syntax, an alternative I use for when I just want to push my code is to tell Git exactly what I want to push and where to put it:
$ git push origin SOMEHASH:main
Note this is plain git push, no jj or any bookmarks involved.
(Update: jj has added its own subcommand for this: jj git push --named main=X where X is the revision to use.)
If you want to push a bookmark/branch for someone else to review or pull, the commands are:
$ jj bookmark create some-name
$ jj git push
(The second command will complain that some-name does not exist remotely, and then tell you how to fix it. There are flags for specifying which remote to push to etc.)
Typically in jj you won't have bookmark names ready when you're sending off code reviews. To simplify things jj can generate a bookmark name for you as it pushes.
$ jj git push -c @
Creating bookmark push-sytrsqlnznzr for revision sytrsqlnznzr
Changes to push to origin:
Add bookmark push-sytrsqlnznzr to 5865f9673d0f
This is my primary workflow when working on GitHub, even solo. Pushing changes in a pull request lets the CI run over it.
jj treats history as mutable, making it natural to edit and reorder commits as you work. When collaborating with others, modifying history can be confusing or dangerous.
jj has a notion of "immutable" commits, which is the part of history that should not be modified. In the default configuration this effectively means code that has been pushed to Git cannot be modified, with the exception of code in tracked bookmarks. This means that you can continue to modify a branch after pushing it, for example in response to code reviews. The next push will update it.
There are further safety checks around things like not letting you move a branch backwards (because that would trim off the later commits). In practice I don't understand all the rules, and sometimes it will prompt me to pass a flag to say "I really do mean to do this". It has been fine so far.
(This final section is trivia and only interesting because I came to understand it when writing this post.)
jj commands that accept commits take a "revset" argument, which is the little language for specifying commits. For example you can say jj diff -r @- to see the diff of the previous commit; the @- expression means "parent of the current commit". As the name suggests, revsets can refer to sets of commits. (jj really ought to pick either "commit" or "revision" for talking about things, it's confusing to have these as synonyms!)
When a bookmark is conflicted, it refers to multiple commits: the commit you had locally and the commit seen remotely. jj does not let you use it as a revset directly (I think to prevent accidents?) unless you use the expression bookmarks(exact:main). (The "exact:..." bit is needed because otherwise bookmarks() does a substring match.)
Meanwhile, note that the command to create a merge in jj is to create a commit with multiple parents, jj new parent1 parent2 ....
Putting these together, with a conflicting main?? bookmark, you can do:
jj diff -r 'bookmarks(exact:main)' to show a diff of what the merge of the two commits would look like
jj new 'bookmarks(exact:main)' to create a merge commit of the two commits
I have never had a reason to need this trivia but it is kind of neat to see how these pieces fit together.