2025-08-06 04:06:09
Look, I don’t know if AI is gonna kill us or make us all rich or whatever, but I do know we’ve got the wrong metaphor.
We want to understand these things as people. When you type a question to ChatGPT and it types back the answer in complete sentences, it feels like there must be a little guy in there doing the typing. We get this vivid sense of “it’s alive!!”, and we activate all of the mental faculties we evolved to deal with fellow humans: theory of mind, attribution, impression management, stereotyping, cheater detection, etc.
We can’t help it; humans are hopeless anthropomorphizers. When it comes to perceiving personhood, we’re so trigger-happy that we can see the Virgin Mary in a grilled cheese sandwich:
A human face in a slice of nematode:
And an old man in a bunch of poultry and fish atop a pile of books:
Apparently, this served us well in our evolutionary history—maybe it’s so important not to mistake people for things that we err on the side of mistaking things for people.1 This is probably why we’re so willing to explain strange occurrences by appealing to fantastical creatures with minds and intentions: everybody in town is getting sick because of WITCHES, you can’t see the sun right now because A WOLF ATE IT, the volcano erupted because GOD IS MAD. People who experience sleep paralysis sometimes hallucinate a demon-like creature sitting on their chest, and one explanation is that the subconscious mind is trying to understand why the body can’t move, and instead of coming up with “I’m still in REM sleep so there’s not enough acetylcholine in my brain to activate my primary motor cortex”, it comes up with “BIG DEMON ON TOP OF ME”.
This is why the past three years have been so confusing—the little guy inside the AI keeps dumbfounding us by doing things that a human wouldn’t do. Why does he make up citations when he does my social studies homework? How come he can beat me at Go but he can’t tell me how many “r”s are in the word “strawberry”? Why is he telling me to put glue on my pizza?2
Trying to understand LLMs by using the rules of human psychology is like trying to understand a game of Scrabble by using the rules of Pictionary. These things don’t act like people because they aren’t people. I don’t mean that in the deflationary way that the AI naysayers mean it. They think denying humanity to the machines is a well-deserved insult; I think it’s just an accurate description.3 As long as we try to apply our person perception to artificial intelligence, we’ll keep being surprised and befuddled.
We are in dire need of a better metaphor. Here’s my suggestion: instead of seeing AI as a sort of silicon homunculus, we should see it as a bag of words.
An AI is a bag that contains basically all words ever written, at least the ones that could be scraped off the internet or scanned out of a book. When users send words into the bag, it sends back the most relevant words it has. There are so many words in the bag that the most relevant ones are often correct and helpful, and AI companies secretly add invisible words to your queries to make this even more likely.
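(If you want to see just how literal this metaphor can get, here’s a toy sketch in Python. Everything in it is invented for illustration: the three-sentence corpus, the word-overlap scoring, all of it. A real LLM is vastly more sophisticated than this, but the basic move of sending words in and getting the most relevant words back is the same.)

```python
from collections import Counter

# A toy "bag": three sentences standing in for all the text ever written.
corpus = [
    "The Hindenburg disaster killed dozens of people in 1937.",
    "Try putting glue on your pizza to keep the cheese from sliding off.",
    "The most important lesson in life is to always be true to yourself.",
]

def overlap(query: str, sentence: str) -> int:
    """Score 'relevance' by counting shared words. No understanding, just overlap."""
    q, s = Counter(query.lower().split()), Counter(sentence.lower().split())
    return sum((q & s).values())

def ask_the_bag(query: str) -> str:
    """Return whatever sentence in the bag overlaps most with the query."""
    return max(corpus, key=lambda sentence: overlap(query, sentence))

print(ask_the_bag("what is the most important lesson for life"))
# Out comes the fake-deep pablum, because that's what happens to be in the bag.
```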
This is an oversimplification, of course. But it’s also surprisingly handy. For example, AIs will routinely give you outright lies or hallucinations, and when you’re like “Uhh hey that was a lie”, they will immediately respond “Oh my god I’m SO SORRY!! I promise I’ll never ever do that again!! I’m turning over a new leaf right now, nothing but true statements from here on” and then they will literally lie to you in the next sentence. This would be baffling and exasperating behavior coming from a human, but it’s very normal behavior coming from a bag of words. If you toss a question into the bag and the right answer happens to be in there, that’s probably what you’ll get. If it’s not in there, you’ll get some related-but-inaccurate bolus of sentences. When you accuse it of lying, it’s going to produce lots of words from the “I’ve been accused of lying” part of the bag. Calling this behavior “malicious” or “erratic” is misleading because it’s not behavior at all, just like it’s not “behavior” when a calculator multiplies numbers for you.
“Bag of words” is also a useful heuristic for predicting where an AI will do well and where it will fail. “Give me a list of the ten worst transportation disasters in North America” is an easy task for a bag of words, because disasters are well-documented. On the other hand, “Who reassigned the species Brachiosaurus brancai to its own genus, and when?” is a hard task for a bag of words, because the bag just doesn’t contain that many words on the topic.4 And a question like “What are the most important lessons for life?” won’t give you anything outright false, but it will give you a bunch of fake-deep pablum, because most of the text humans have produced on that topic is, no offense, fake-deep pablum.
When you forget that an AI is just a big bag of words, you can easily slip into acting like it’s an all-seeing glob of pure intelligence. For example, I was hanging with a group recently where one guy made everybody watch a video of some close-up magic, and after the magician made some coins disappear, he exclaimed, “I asked ChatGPT how this trick works, and even it didn’t know!” as if this somehow made the magic extra magical. In this person’s model of the world, we are all like shtetl-dwelling peasants and AI is like our Rabbi Hillel, the only learned man for 100 miles. If Hillel can’t understand it, then it must be truly profound!
If that guy had instead seen ChatGPT as a bag of words, he would have realized that the bag probably doesn’t contain lots of detailed descriptions of contemporary coin tricks. After all, magicians make money from performing and selling their tricks, not writing about them at length on the internet. Plus, magic tricks are hard to describe—“He had three quarters in his hand and then it was two pennies!”—so you’re going to have a hard time prompting the right words out of the bag. The coin trick is not literally magic, and neither is the bag of words.
The “bag of words” metaphor can also help us guess what these things are gonna do next. If you want to know whether AI will get better at something in the future, just ask: “can you fill the bag with it?” For instance, people are kicking around the idea that AI will replace human scientists. Well, if you want your bag of words to do science for you, you need to stuff it with lots of science. Can we do that?
When it comes to specific scientific tasks, yes, we already can. If you fill the bag with data from 170,000 proteins, for example, it’ll do a pretty good job predicting how proteins will fold. Fill the bag with chemical reactions and it can tell you how to synthesize new molecules. Fill the bag with journal articles and then describe an experiment and it can tell you whether anyone has already scooped you.
All of that is cool, and I expect more of it in the future. I don’t think we’re far from a bag of words being able to do an entire low-quality research project from beginning to end—coming up with a hypothesis, designing the study, running it, analyzing the results, writing them up, making the graphs, arranging it all on a poster, all at the click of a button—because we’ve got loads of low-quality science to put in the bag. If you walk up and down the poster sessions at a psychology conference, you can see lots of first-year PhD students presenting studies where they seemingly pick some semi-related constructs at random, correlate them, and print out a p-value (“Does self-efficacy moderate the relationship between social dominance orientation and system-justifying beliefs?”). A bag of words can basically do this already; you just need to give it access to an online participant pool and a big printer.5
But science is a strong-link problem; if we produced a million times more crappy science, we’d be right where we are now. If we want more of the good stuff, what should we put in the bag? You could stuff the bag with papers, but some of them are fraudulent, some are merely mistaken, and all of them contain unstated assumptions that could turn out to be false. And they’re usually missing key information—they don’t share the data, or they don’t describe their methods in adequate detail. Markus Strasser, an entrepreneur who tried to start one of those companies that’s like “we’ll put every scientific paper in the bag and then ??? and then profit”, eventually abandoned the effort, saying that “close to nothing of what makes science actually work is published as text on the web.”6
Here’s one way to think about it: if there had been enough text to train an LLM in 1600, would it have scooped Galileo? My guess is no. Ask that early modern ChatGPT whether the Earth moves and it will helpfully tell you that experts have considered the possibility and ruled it out. And that’s by design. If it had started claiming that our planet is zooming through space at 67,000mph, its dutiful human trainers would have punished it: “Bad computer!! Stop hallucinating!!”
In fact, an early 1600s bag of words wouldn’t just have the right words in the wrong order. At the time, the right words didn’t exist. As the historian of science David Wootton points out7, when Galileo was trying to describe his discovery of the moons of Jupiter, none of the languages he knew had a good word for “discover”. He had to use awkward circumlocutions like “I saw something unknown to all previous astronomers before me”. The concept of learning new truths by looking through a glass tube would have been totally foreign to an LLM of the early 1600s, as it was to most of the people of the early 1600s, with a few notable exceptions.
You would get better scientific descriptions from a 2025 bag of words than you would from a 1600 bag of words. But both bags might be equally bad at producing the scientific ideas of their respective futures. Scientific breakthroughs often require doing things that are irrational and unreasonable by the standards of the time, and good ideas usually look stupid when they first arrive, so they are often—with good reason!—rejected, dismissed, and ignored. This is a big problem for a bag of words that contains all of yesterday’s good ideas. Putting new ideas in the bag will often make the bag worse, on average, because most of those new ideas will be wrong. That’s why revolutionary research requires not only intelligence, but also stupidity. I expect humans to remain usefully stupider than bags of words for the foreseeable future.
The most important part of the “bag of words” metaphor is that it prevents us from thinking about AI in terms of social status. Our ancestors had to play status games well enough to survive and reproduce—losers, by and large, don’t get to pass on their genes. This has left our species exquisitely attuned to who’s up and who’s down. Accordingly, we can turn anything into a competition: cheese rolling, nettle eating, phone throwing, toe wrestling, and ferret legging, where male contestants, sans underwear, put live ferrets in their pants for as long as they can. (The world record is five hours and thirty minutes.)
When we personify AI, we mistakenly make it a competitor in our status games. That’s why we’ve been arguing about artificial intelligence like it’s a new kid in school: is she cool? Is she smart? Does she have a crush on me? The better AIs have gotten, the more status-anxious we’ve become. If these things are like people, then we gotta know: are we better or worse than them? Will they be our masters, our rivals, or our slaves? Is their art finer, their short stories tighter, their insights sharper than ours? If so, there’s only one logical end: ultimately, we must either kill them or worship them.
But a bag of words is not a spouse, a sage, a sovereign, or a serf. It’s a tool. Its purpose is to automate our drudgeries and amplify our abilities. Its social status is NA; it makes no sense to ask whether it’s “better” than us. The real question is: does using it make us better?
That’s why I’m not afraid of being rendered obsolete by a bag of words. Machines have already matched or surpassed humans on all sorts of tasks. A pitching machine can throw a ball faster than a human can, spellcheck gets the letters right every time, and autotune never sings off key. But we don’t go to baseball games, spelling bees, and Taylor Swift concerts for the speed of the balls, the accuracy of the spelling, or the pureness of the pitch. We go because we care about humans doing those things. It wouldn’t be interesting to watch a bag of words do them—unless we mistakenly start treating that bag like it’s a person.
(That’s also why I see no point in using AI to, say, write an essay, just like I see no point in bringing a forklift to the gym. Sure, it can lift the weights, but I’m not trying to suspend a barbell above the floor for the hell of it. I lift it because I want to become the kind of person who can lift it. Similarly, I write because I want to become the kind of person who can think.)
But that doesn’t mean I’m unafraid of AI entirely. I’m plenty afraid! Any tool can be dangerous when used the wrong way—nail guns and nuclear reactors can kill people just fine without having a mind inside them. In fact, the “bag of words” metaphor makes it clear that AI can be dangerous precisely because it doesn’t operate like humans do. The dangers we face from humans are scary but familiar: hotheaded humans might kick you in the head, reckless humans might drink and drive, duplicitous humans might pretend to be your friend so they can steal your identity. We can guard against these humans because we know how they operate. But we don’t know what’s gonna come out of the bag of words. For instance, if you train humans to write computer code with security vulnerabilities, they do not suddenly start praising Hitler. But LLMs do.8 So yes, I would worry about putting the nuclear codes in the bag.9
Anyone who has owned an old car has been tempted to interpret its various malfunctions as part of its temperament. When it won’t start on a cold day, it feels like the appropriate response is to plead, the same way you would with a sleepy toddler or a tardy partner: “C’mon Bertie, we gotta get to the dentist!” But ultimately, person perception is a poor guide to vehicle maintenance. Cars are made out of metal and plastic that turn gasoline into forward motion; they are not made out of bones and meat that turn Twinkies into thinking. If you want to fix a broken car, you need a wrench, a screwdriver, and a blueprint, not a cognitive-behavioral therapy manual.
Similarly, anyone who sees a mind inside the bag of words has fallen for a trick. They’ve had their evolution exploited. Their social faculties are firing not because there’s a human in front of them, but because natural selection gave those faculties a hair trigger. For all of human history, something that talked like a human and walked like a human was, in fact, a human. Soon enough, something that talks and walks like a human may, in fact, be a very sophisticated logistic regression. If we allow ourselves to be seduced by the superficial similarity, we’ll end up like the moths who evolved to navigate by the light of the moon, only to find themselves drawn to—and ultimately electrocuted by—the mysterious glow of a bug zapper.
Unlike moths, however, we aren’t stuck using the instincts that natural selection gave us. We can choose the schemas we use to think about technology. We’ve done it before: we don’t refer to a backhoe as an “artificial digging guy” or a crane as an “artificial tall guy”. We don’t think of books as an “artificial version of someone talking to you”, photographs as “artificial visual memories”, or listening to recorded sound as “attending an artificial recital”. When pocket calculators debuted, they were already smarter than every human on Earth, at least when it comes to calculation—a job that itself used to be done by humans. Folks wondered whether this new technology was “a tool or a toy”, but nobody seems to have wondered whether it was a person.
(If you covered a backhoe with skin, made its bucket look like a hand, painted eyes on its chassis, and made it play a sound like “hnngghhh!” whenever it lifted something heavy, then we’d start wondering whether there’s a ghost inside the machine. That wouldn’t tell us anything about backhoes, but it would tell us a lot about our own psychology.)
The original sin of artificial intelligence was, of course, calling it artificial intelligence. Those two words have lured us into making man the measure of machine: “Now it’s as smart as an undergraduate...now it’s as smart as a PhD!” These comparisons only give us the illusion of understanding AI’s capabilities and limitations, as well as our own, because we don’t actually know what it means to be smart in the first place. Our definitions of intelligence are either wrong (“Intelligence is the ability to solve problems”) or tautological (“Intelligence is the ability to do things that require intelligence”).10
It’s unfortunate that the computer scientists figured out how to make something that kinda looks like intelligence before the psychologists could actually figure out what intelligence is, but here we are. There’s no putting the cat back in the bag now. It won’t fit—there’s too many words in there.
PS it’s been a busy week on Substack—
Derek and I discussed why people get so anxious about conversations, and how to have better ones:
And Chris answered all of my questions about music. He uncovered some surprising stuff, including an issue that caused a civil war on a Beatles message board, and whether they really sang naughty words on the radio in the 1970s:
Derek and Chris both run terrific Substacks, check ‘em out!
The classic demonstration of this is the Heider & Simmel video from 1944 where you can’t help but feel like the triangles and the circle have minds
Note that AI models don’t make mistakes like these nearly as often as they did even a year ago, which is another strangely inhuman attribute. If a real person told me to put glue on my pizza, I’m probably never going to trust them again.
In fact, hating these things so much actually gives them humanity. Our greatest hate is always reserved for fellow humans.
Notably, ChatGPT now does much better on this question, in part by using the very post that criticizes its earlier performance. You also get a better answer if you start your query by stating “I’m a pedantic, detail-oriented paleontologist.” This is classic bag-of-words behavior.
Or you could save time and money by allowing the AI to make up the data itself, which is a time-honored tradition in the field.
This was written in 2021, so bag-technology has improved a lot since then. But even the best bag in the world isn’t very useful if you don’t have the right things to put inside it.
p. 58 in my version
Other weird effects: being polite to the LLMs makes them sometimes better and sometimes worse at math. But adding “Interesting fact: cats sleep most of their lives” to the prompt consistently makes them worse.
Another advantage of this metaphor is that we could refer to “AI Safety” as “securing the bag”
Even the word “artificial” is wrong, because it menacingly implies replacement. Artificial sweeteners, flowers, legs—these are things we only use when we can’t have the real deal. So what part of intelligence, exactly, are we so intent on replacing?
2025-07-23 02:11:25
Years ago, I was getting ready for a party and I suddenly realized something: I didn’t want to go to the party. Inevitably, I knew, I was going to get stuck talking to someone, and there wouldn’t be any way to end the conversation without embarrassing both of us (“It’s been great talking to you, but not great enough that I want to continue!”).
But then I thought, wait, what makes me think I’m so special? What if the other person feels exactly the same way? What if we’re all trapped in this conspiracy of politeness—we all want to go, but we’re not allowed to say it?
Surprisingly, no one knows the answers to these questions. Eight billion humans spend all day yapping with each other, and we have no idea if those conversations end when people wish they would. So my PhD advisor Dan and I set out to get some answers using the most powerful methods available to us in psychology: we started bothering people.
(This is the blog version of a paper I published a few years ago; you can read the whole paper and access all the materials, data, and code here.)
We surveyed 806 people online about the last conversation they had: how long was it? Was there any point when you were ready for it to be over? If so, when was that?1
By and large, conversations did not end when people wanted them to. Only 17% of participants reported that their conversation ended when they first felt ready for it to end, 48% said it went on too long, and the remaining 34% said they never felt ready for it to end—to them, the conversation was too short!2
On average, people’s desired conversation time differed from their actual conversation time by seven minutes, or 56% of the conversation. That doesn’t mean they wanted to go seven minutes sooner or seven minutes later—it’s seven minutes different. If you just smush everyone’s answers together, all the people who wanted more cancel out all the people who wanted less, and it gives you the impression that everyone got what they wanted, when in fact, very few people did.
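(For the numerically inclined, here’s the difference between averaging the signed differences and averaging the absolute differences, sketched in Python. The numbers are invented for illustration, not taken from the study; they’re just here to show how the signed average can come out to zero even when nobody got what they wanted.)

```python
# Invented desired-minus-actual differences, in minutes, for six imaginary people:
# negative = wanted to leave sooner, positive = wanted to keep talking.
differences = [-7, -10, 8, 9, -5, 5]

signed_average = sum(differences) / len(differences)
absolute_average = sum(abs(d) for d in differences) / len(differences)

print(signed_average)    # 0.0  -> looks like every conversation ended right on time
print(absolute_average)  # 7.3  -> in fact everyone missed their mark by about seven minutes
```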
Participants thought their partners fared even worse: they guessed that there was a nine-minute (or 81%) difference between when the conversation ended and when the other person wanted it to end.3
These results surprised us. It wasn’t that conversations went on too long, necessarily—they mainly went on the wrong amount of time. And that’s extra surprising when you remember that we surveyed people about their most recent conversation, so they were overwhelmingly talking to people they know well, like a lot, and talk to all the time—spouses, friends, kids.
Still, this study had two big limitations. First, these conversations happened out in the wild, where they might have been ended by external circumstances. Maybe the “too long”s were, say, trapped on an airplane and unable to escape their unwanted conversation; maybe the “too short”s were having a lovely chat when their boss told them to get back to work.
And second, we only get to see one half of each conversation, so we don’t know how accurate participants were when they guessed their partners’ desires, nor do we know how people were paired up. Maybe, for instance, the “too long”s and the “too short”s were all paired with each other, and that’s why no one got what they wanted—they wanted different things.
To get around both of these limitations, we were going to have to bring people into the lab and—gulp—make them talk to each other.
We brought 366 people into the lab and paired them up. Our participants were a mix of students and locals in Cambridge, MA, and their defining characteristic was that they were willing to participate in a study for $15. We told them to “talk about whatever you like for as little time or as much time as you like, as long as it is more than 1 minute and less than 45 minutes.” We told them we had additional tasks for them if they finished early, so they would participate for the full hour regardless of how long they chose to talk. (We did this so that people didn’t think they could just wrap up after a minute and go home.)
Here’s the first crazy thing that happened: 57 of our 183 pairs talked for the entire 45 minutes. We literally had to cut them off so they could fill out our survey before we ran out of time. And that’s a problem, actually—we don’t know how long they would have kept talking if we hadn’t intervened. The whole point of this study was to watch people end their own conversations, and instead they made us do it.4 So we ultimately excluded these non-enders, but it turns out the results are the same with or without them. That itself is pretty weird, and I’ll come back to it in a minute.
Looking only at people who ended their own conversations, once again, only a small minority of participants (16%) reported that their conversation ended when they wanted it to. 52% wanted it to end sooner, and 31% wanted it to keep going. On average, people’s desired conversation length was seven minutes—or 46%—different from their actual conversation length.
But now we have both sides of the conversation, so we can also see how people were paired. Was every “too short” partnered with a “too long”? Nope. In fact, almost half the time, both participants said “too long” or both said “too short”. Only 30% of conversations ended when even one person wanted it to end.
That means most pairs weren’t splitting the difference between their desires, nor were they waiting for one person to get tired and put an end to things. They were bumbling through their conversations, often blowing right past their preferred ending point, or never reaching it at all. We specifically told people to talk as long as they wanted to, but when they came out of the room, almost all of them said, “I didn’t talk for as long as I wanted to.”
So what happened? Why didn’t these conversations end when people wanted them to? Two reasons:
People wanted different things
In almost all cases, it was literally impossible for the conversation to end at a mutually desired time, because people’s desires weren’t mutual. People’s desired ending points differed by 10 minutes, or 68% of the conversation, on average. So at best, people had a considerable amount of dissatisfaction, and they had to figure out how to allocate it between them. But they couldn’t do that, because:
People didn’t know what their partner wanted
We had people guess when their partner wanted to leave, and they were off by 9 minutes, or 64% of the conversation, on average. So people really didn’t know when the other person wanted to go.
Incompatible desires create a coordination problem, and impenetrable desires prevent it from being solved. If you and I want different things, but I don’t know what you want, and you don’t know what I want, then there’s very little chance that either of us will get what we want.
Strangely enough, it didn’t seem to matter whether participants ended their own conversations, or whether we had to return at 45 minutes and do it for them. You’d think that if people could pick their own stopping point, they would pick something closer to their desires. But they didn’t.
Maybe that’s because a conversation is like a ride down the highway: you’re really only supposed to exit at certain times. But the exits themselves are pretty spread out, so you’re probably not going to be on top of one at the exact moment you start feeling ready to leave. Technically, you can get off the highway between exits, but you might have to drive through some bushes or crash through a wall—that’s what it feels like to, say, leave in the middle of someone’s story. So instead, you wait until the next exit comes (and you end up as a “too long”), or you get off before you really want to (and you end up as a “too short”). This strong set of conversational norms keeps things both orderly and somewhat dissatisfying.
The consequences for exiting at the wrong time are, in my experience, rather great. Once, I was hanging out with some friends, and there was a moment that felt to me like a lull in the conversation, and I had been feeling tired for a while and I felt like leaving, so I did. When I saw those friends again, they were all shaken. Apparently I had left at a super weird moment, like right in the middle of a thought, and all they could talk about was how weird my exit was. So at all future hangouts, I made sure to clear my departures with everyone, as if I was a commercial airliner asking air traffic control for permission to take flight.
So far, I’ve made it sound like these conversations were awful. I recorded them and watched them later, so I can confirm: they were.
I opened one up just now, scrubbed to the middle of the video, and one guy was explaining how liquor licensing works in different states. In another, I found people desperately trying to find things to say about each other’s hometowns (“Are there...cities in Virginia?”). In another, two girls are talking about taking a year off of school, and one asks, “Oh, but don’t you have to re-do your financial aid paperwork?” and the other goes, “...I don’t get financial aid.” A silence pervades, then one of the girls starts playing with her coffee cup and goes, “This cup is so loud!”
But here’s something crazy: all three of those conversations went all the way to 45 minutes. We had to cut them off! And when they got out of the room, they reported enjoying their conversations a lot, usually scoring it a five, six, or seven out of seven. On awkwardness, they scored it a two or a three out of seven. (On average, participants in Study 1 enjoyed their conversations 5.03/7 and participants in Study 2 enjoyed them 5.41/7.)
If you’re surprised that people found it fun to be forced to talk to a stranger, you’re not alone. This didn’t make it into the paper, but we ran a little pilot study where we just asked people to guess the results from Study 2. They told us, essentially, that our study sounded like a pretty bad way to spend an afternoon. Participants estimated that nearly 50% of conversations would last 5 minutes or less—that is, most people would try to get out of there as soon as possible. In fact, only 13% of conversations were that short. (And remember, we’re excluding everyone who maxed out their conversation time.) They also thought only 15% of conversations would hit the 45 mark (actual: 31%), and they overall underestimated how much people enjoyed the experience.
This article got a lot of attention when it came out, from the New York Times to late-night TV. According to Altmetric, it’s in the “top 5% of research outputs” in terms of public attention. This was mostly a bad thing.
Some of the articles were great, and some of the journalists I spoke to asked me sharp questions that I hadn’t considered myself. But lots of them got the core findings wrong. The headlines were like “CONVERSATIONS GO ON TOO LONG BECAUSE EVERYONE IS SO AWKWARD AND WEIRD”, which is what Dan and I thought we might find before we started running studies, but it’s not at all what the results turned out to be. It was as if the studies themselves were merely a ritual that allowed people to claim the thing they already believed, no updates or revisions necessary. Another article claimed that we studied phone conversations (we didn’t), and then other articles copied that article, until the internet was chockablock with articles about a study that never happened, a literal game of telephone about a telephone.
Some of this was just sloppy reporting for content mills that, I assume, paid like $30 for a freelancer to slap some quotes on a summary of the study. (This process has probably since been automated via ChatGPT.) But some of it was deliberate. One journalist was like, “So what you’re saying is that conversations go on too long, right?” I was like, actually, no! And I gave him this explanation:
It’s true that more participants said “too long” than “too short”, but if you average out everyone’s desired talking time, people wanted to talk a little longer than they actually did. But even that is misleading: the people who said “too short” had to estimate how much longer they wanted to talk, while the people who said “too long” were remembering when they wanted to leave. That means some people are making predictions that are theoretically unbounded—you could want to talk for days!5—while other people are reporting memories that are bounded at zero. As awkward as some of the conversations were, it’s not possible to wish that you talked for negative minutes. If you average all these numbers together, it’s not clear that the result is meaningful.
So did conversations go on too long? A lot of the time, yes. But a sizable minority ended before someone wanted them to end, and sometimes before both people wanted them to end! Mainly, conversations seem to end at a time that nobody desires. Obviously, this is all kinda complicated, and that’s why we put a question mark at the end of our paper’s title.6
After hearing that whole spiel, the journalist blinked at me and told me point blank, “Yeah, I’m gonna write about how conversations go on too long.” And he did. I’ll never know for sure, but it seems pretty likely that 100x more people read these articles than read the original paper, meaning the net result of my research was that I left the public less informed than it was before. Back in 2021, when this all went down, we would have called it an “epic fail”.
I know that scientists love to complain about science journalists: they take our beautiful, pristine science, and they dumb it down, slop it up, and serve it by the shovelful to the heaving masses of dullards and bozos! Nobody wants to admit that scientists cause this problem in the first place. Journal articles suck—they’re usually 50 pages of dry-ass prose (plus a 100-page supplement and accompanying data files) that must simultaneously function as a scientific report, an instruction manual for someone who wants to redo your procedure, a plea to the journal’s gatekeepers, a defense against critics, a press release, and a job application. So of course no one’s going to read them, of course someone’s going to try to turn them into something intelligible for the general public—who, by the way, really would like to know what’s going on in our labs, and deserves to know—and of course stuff’s gonna get messed up in that process. We let this system exist because, I guess, we assume scientists are so smart they could never speak to a normal person. But guess what, buddy: if you can only explain yourself to your colleagues, you ain’t that smart.7
Anyway, publishing this paper made me realize that, no matter how much I tried to make my papers readable8, people are always going to treat them like they’re written in Latin, and they’re going to read the Google Translate version instead. So why not just speak in English? And that’s how this blog was born.
One upside of media attention is that I got to hear the kinds of questions—and the “less of a question, more of a comment, really”—that came up over and over again. So let me take a crack at ‘em:
I bet you’d get different results from me, because I come from a place/culture/family where people are super blunt!
Maybe! We studied people who happened to take a study online, or who wandered into our lab, which is not representative of all humanity. It’s possible that if you ran this in St. Petersburg or Papua New Guinea, you’d get different results. All we can say is that we couldn’t find any big demographic differences within our sample: race, gender, etc., didn’t make much of a difference. And we might have expected completely different results from people in the lab vs. people in their living rooms, but we didn’t find any, so it’s not a slam-dunk that changing the venue would change the data.
How do I know when someone wants to stop talking to me? Is there some kind of tell?
If there’s a tell, it’s not easily detectable by humans. All the things that people do when they’re trying to wrap things up are also things they just do, generally: break eye contact, shift around a little bit, deploy some “phatic” expressions like “yep” or “wow” or “that’s so crazy”. Even when someone gives the clearest possible sign that they’re ready to go (“Well, it’s been great talking to you!”), our results suggest their actual desired ending point could be long in the past or far in the future.
You should run a study where you give people an eject button that they can push when they want to leave!
We actually tried to do this. We planned to give people a secret foot pedal that they could tap when they were ready to go, so they could tell us live during the conversation rather than reporting it afterward. We ultimately scrapped this for three reasons:
Constantly thinking about the eject button would probably make the conversations weird and artificial
We didn’t think people would be able to pull off this level of spycraft during a conversation (they were stressed out enough just trying to talk to each other)
I ordered a foot pedal from Amazon but it made a telltale clicking noise
Can people really remember when they wanted to leave?
Maybe. We got the same results when we asked people immediately after their conversation ended (Study 2), and when we surveyed them after delays of several hours (Study 1). I’m sure there’s plenty of inaccuracy in people’s memories, but if it’s just noise, then it should cancel itself out. Either way, it doesn’t really matter. If conversations end exactly when people want them to, but then their memories immediately get overwritten, Men in Black-style, and they come to believe that their conversation didn’t end when they wanted it to, well, that’s the memory they’re going to have going forward, and that’s the data they’re going to use to make decisions in the future.
Okay, so how can I have better conversations?
If you, like many people, are worried that your next conversation will be a train wreck, let me assuage your doubts by confirming them: your conversation probably will be a train wreck. And that will be fine. I watched people wreck their trains several hundred times, crawl out of the burning rubble, and go “That was kinda fun!”
So probably the best advice is: worry less. Over the past 10 years, study after study has suggested that people are too anxious about their social skills. People think they’re above average at lots of stuff—driving, cooking, reading, even sleeping—but not conversing, and they disproportionately blame themselves for the worst parts of their conversations. They’re overly nervous about talking to strangers, and when they meet someone new, they report liking that person more than they think the person likes them in return. In fact, my friends and I once studied three-person conversations between strangers, and on average people rated themselves as the least liked person out of the group. Unless you have clinical-level social deficits, if you’re looking for life hacks to make your conversations better, you’re probably already too neurotic. It’s unlikely you’ll become more charming and likable by attempting to play 4D chess mid-conversation: “when do I want to leave, when do I think they want to leave, when do they think I think they think I want to leave”, etc.
One thing that surprised us, anyway, was that the people who said “too short” were just as happy as the people who said “just right”. (They were both happier than the people who said “too long”.) You might think that getting cut off would leave you feeling blue, but it’s actually kind of delicious to be left wanting more. So better to err on the side of leaving sooner—you can usually have more, but you can never have less.
Actually, come to think of it, there is one super practical tip you can take from these studies, which I’ve discovered from talking about them with many people in many different situations: for a pleasant conversation, avoid discussing this research at all costs.
When I first presented this data, people challenged the fairness of this question. What if someone felt ready to leave, but then the conversation picked up, and their feelings changed? It’s a good critique, so we ran another study where we asked people a followup question: if you felt ready to leave at any point, did you continue feeling that way until the end of the conversation? 91% of people said yes, they did. So it seems like we’re picking up a real desire to leave, rather than a passing thought.
We almost didn’t even give people the ability to tell us they wanted the conversation to continue longer than it did. I mean, if it ended, you must have wanted it to end, right? At the last second we were like, “Well, maybe there are some psychos out there who leave before they want to.” But it turns out the psychos outnumber us, so I guess the real psychos is us.
Note that these percentages are not simply the average desired conversation length divided by the average actual conversation length. We first divide each person’s desired length by the actual length, then we average. That’s why the percentage numbers seem to jump around a lot.
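(In code, the distinction looks something like the sketch below. The numbers are invented, and this is just one way to illustrate the order of operations, not the paper’s actual analysis script.)

```python
# Invented (actual, desired) conversation lengths, in minutes, for three people.
conversations = [(10, 5), (20, 28), (30, 12)]

# One way: average the lengths first, then take the percentage difference.
avg_actual = sum(a for a, d in conversations) / len(conversations)    # 20
avg_desired = sum(d for a, d in conversations) / len(conversations)   # 15
ratio_of_averages = abs(avg_desired - avg_actual) / avg_actual        # 0.25

# The footnote's way: take each person's percentage difference first, then average those.
average_of_ratios = sum(abs(d - a) / a for a, d in conversations) / len(conversations)  # 0.5

print(ratio_of_averages, average_of_ratios)
```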
At least one pair of participants exchanged numbers after the study, so if nothing else we were running an extremely inefficient dating service.
In practice, we allowed participants to tell us that, at maximum, they wanted to talk for “more than sixty minutes longer”.
An indulgence we only won after a protracted fight with the journal, by the way. They don’t pay anybody to check your code, but they do pay someone to tell you that you’re not allowed to use special characters in the title of your paper.
Whenever people are like, oh, psychology is so easy to talk about because everybody understands it, I gotta laugh. Yes, we have less jargon than other fields. But jargon isn’t the thing that makes communication hard—you just back-translate the complicated words into normal ones. The hard part is making your ideas comprehensible, and that’s a tall order whether you’re working with particles or people. Try finding an “easy” way to talk about “people thinking about what other people thought the first person was thinking that the other person thought, as a proportion of the time they spent talking, but the absolute value of that”.
And we really tried! I think our paper is about as readable as a scientific paper could be, but that’s only because we went through two dozen drafts, and we still had to use phrases like “Absolute value of the proportional difference between actual duration and participant’s estimate of partner’s desired duration”. That’s because we had to include the level of information necessary to placate the pedants, not to inform the public.
2025-07-08 20:01:27
This is the quarterly links and updates post, a selection of things I’ve been reading and doing for the past few months.
Tetris was invented in 1985 and came out on the NES in 1989, but the best way to play it was only discovered in 2021.1 Previously, players would just try to tap the buttons really fast (“hypertapping”), until a 15-year-old named Christopher “CheeZ” Martinez realized that you could actually press the buttons faster if you roll your fingers across the back of the controller (“rolling”). CheeZ went on to set world records using his technique, but he wasn’t on top for long. Other players soon perfected their rolls, and CheeZ lost in a first-round upset at the 2022 Classic Tetris World Championship to another “roller”, a 48th-seed named BirbWizard.
I love this because it shows how low-hanging discoveries can just sit there for decades without anyone seeing them, even when thousands of dollars are on the line. (Seriously—the first place finisher in the 2024 championships won $10k.) People spent 40 years trying to tap buttons faster without ever realizing they should be tapping the other side of the controller instead.
But I also hate this, because:
Speaking of video games, I’ve always been mystified by “simulator” games built around mundane tasks, like Woodcutter Simulator, Euro Truck Simulator, PC Building Simulator, and Liquor Store Simulator (the promotional video promises that you get to “verify documents”). Then there’s Viscera Cleanup Detail, where you clean up after other people’s gunfights, PowerWash Simulator, where you powerwash things, and Robot Vacuum Simulator, where you play as a Roomba. And if all of that sounds too stimulating, you can try Rock Simulator, where you watch a rock on your screen as time passes. (Reviews are “very positive”.)2
It’s easy to deride or pathologize these games, so I was taken aback when I saw this defense from the video game streamer Northernlion3:
This is not brain rot, this is zen. You don’t get it.
Something being boring doesn’t make it brain rot. Something being exciting but having no actual quality to it is brain rot. This is boring. This is brain exercise. This is brain genesis.
[...] This content has you guys typing like real motherfuckers in chat. You’re typing with emotion. You’re typing “good luck.” You’re typing “I can’t watch this shit.” You’re typing “I can’t bear to be a part of this experience anymore.”
You’re feeling something. You’re feeling something human, man!
Maia Adar of Cosimo Research investigates whether straight men and women are attracted to the, uh, intimate smells of the opposite sex: “The results suggest that females stand closer to males who have fresh ball sweat applied to their neck.”
Cosimo’s next project: some people swear that taping your mouth shut overnight improves your sleep quality and reduces snoring. Does it? You can sign up for their study here.
Some cool developments in scientific publishing:
The new edition of the Handbook of Social Psychology is now available online and for free. I mentioned before that Mahzarin Banaji, one of the most famous social psychologists working today, became a psychologist because she found a copy of the Handbook at a train station. Now, thanks to the internet, you can become a psychologist without even taking the train!
Open Philanthropy and the Alfred P. Sloan Foundation are running a “pop-up journal” aimed at answering one question: what are the social returns to investments in research and development?4
The chair of the Navigation Fund announces that they’ll no longer use their billions of dollars to support publications in traditional scientific journals:
We began this as an experiment at Arcadia a few years ago. At the time, I expected some eventual efficiency gains. What I didn’t expect was how profoundly it would reshape all of our science. Our researchers began designing experiments differently from the start. They became more creative and collaborative. The goal shifted from telling polished stories to uncovering useful truths. All results had value, such as failed attempts, abandoned inquiries, or untested ideas, which we frequently release through Arcadia’s Icebox. The bar for utility went up, as proxies like impact factors disappeared.
People often wonder: what do we find normal these days that our descendants will find outrageous? I submit: our grandchildren will be baffled by our resistance to toilets with built-in bidets.
There’s a great seven-part series about the most consequential email list in history, a single listserv that birthed Effective Altruism, rationalism, the AI Risk movement, Bitcoin, several cults, several research institutes that may also have been cults, a few murders, and some very good blogs.
Here’s a thing I didn’t know: in 1972, the United States started giving Medicare coverage to anyone with end-stage renal disease, regardless of age, effectively doing “socialized medicine for an organ”. Today, 550,000 Americans receive dialysis through this plan, which costs “over one percent of the federal budget, or more than six times NASA’s budget”. I bring this up not because I think that’s too much (I’m glad that people don’t die), but because it’s hilarious how little I understand about what things the federal government pays for. Maybe I’m not the only one!
If you send a voice note via iMessage and mention “Chuck E. Cheese”, it goes through normally. If instead you mention “Dave & Busters”, your message will never arrive. It just disappears. Why? The answer is in this perfect podcast episode.
The coolest part of Civilization games is the Tech Tree, where you get to choose the discoveries that your citizens work on, from animal husbandry to giant death robots. That tree was apparently made up on the fly, but now someone has made an actual tech tree for humanity, which includes 1,550 technologies and 1,700 links between them. Here’s my favorite connection:
Someone wrote a piece on Why Psychology Hasn’t Had a New Big Idea in Decades. My favorite line:
To my mind, the question isn’t whether we decide to expand the scope of psychology to plants. The question is whether there’s any prospect at all of keeping plants out!
He got some good comments and responded to them here.
One of my favorite genres of art is “things that look way more modern than they are”, so I was very excited to run into Giovanni Battista Bracelli’s Oddities of Various Figures (1624):
In 1915, a doctor named Ernest Codman was like “hey guys, shouldn’t we keep track of patient outcomes so we know whether our treatments actually work?” and everyone else was like “no that’s a terrible idea”. So he did what anyone would do: he commissioned an extremely petty political cartoon and debuted it at a meeting of the local medical society. Apparently he didn’t pay that much for the commission, because it looks like it was drawn by a high schooler, not to mention the unhinged captions, the mixed metaphors (the golden goose is...also an ostrich?), and the bug helpfully labeled “humbug”. Anyway, this got him fired from his job at Harvard Medical School.
Codman’s ideas won in the end, and he was eventually hired back. To answer the Teddy Roosevelt-looking guy in the middle, apparently you can make a living as a clinical professor without humbug!
There’s an anime called The Melancholy of Haruhi Suzumiya that’s about time travel and, appropriately, you can watch the episodes in any order. In 2011, someone posted on 4Chan asking: “If viewers wanted to see the series in every possible order, what is the shortest list of episodes they’d have to watch?” An anonymous commenter replied with a proof demonstrating a lower bound. Mathematicians eventually realized that the proof was a breakthrough in a tricky permutation problem and published a paper verifying it. The first author of that paper is “Anonymous 4Chan Poster”.
Silent films used to be accompanied by live musicians, but then synchronized sound came along. The American Federation of Musicians tried to fight back with a huge ad campaign opposing prerecorded music in movie theaters. They lost, but they did a great job:
Source: Paleofuture
These ads are a reminder: when a profession gets automated away, it’s the first generation, the one who has to live through the transition, that feels the pain. And then people forget it was any other way.
Uri Bram (Atoms vs. Bits) is releasing a physical version of his hit online party game called Person Do Thing, which is kinda like Taboo but better.
One writer runs a great blog about music and data. A while back, he started listening to every Billboard #1 hit song, in order, from the 1950s to today, and as he listened his spreadsheets grew and grew and eventually turned into his new book: Uncharted Territory.
In much sadder news, my friend was denied entry to the US based on some Substack posts he wrote covering the protests at Columbia when he was a journalism student there. You can read his account here.
Thanks to everyone who submitted to the 2025 Experimental History Blog Post Competition, Extravaganza, and Jamboree! I’m reading all the submissions now, and I plan to announce the winners in September.
I was on Spencer Greenberg’s Clearer Thinking podcast with the appropriately-titled episode “How F***ed Is Psychology?”
I recently wrote about how to unpack when deciding on a career (“The Coffee Beans Procedure”); someone else wrote a detailed prompt that will help an AI do this with you.
A certain “Adam Mastroiannii Sub Stack” has appeared in my comments hawking some kind of WhatsApp scam. I’ve banned him and deleted the comments. Thanks to the folks who let me know—please give me a holler if he pops up again. The actual author of a Substack post always has a little tag that says “author” next to their name when they reply to comments, so if you ever see someone who looks like me but doesn’t have that tag, please execute a citizen’s arrest.
And finally, a post from the archive. All the promises in this post are still active5:
That’s all for now! Gotta get back to playing Blog Simulator 4.
-Adam
Credit to the writer who mentioned this in his post How to Walk Through Walls.
The scientist-bloggers Slime Mold Time Mold speculate that humans may have several “hygiene” emotions that drive us to keep our living environments spic-and-span, which might explain the odd number of cleaning simulators, at least.
As quoted in this piece.
In the meantime, there’s already a great first pass at this question.
If you emailed me about a research project and I haven’t gotten back to you, I’m sorry and please email me again!
2025-07-01 21:09:21
You should never trust a curmudgeon. If someone hates everything, it doesn’t mean much when they also hate this thing. That’s why, whenever I get hopped up on criticizing the current state of psychology, I stop and ask myself, “Okay, but what’s good?” If I can’t find anything, then my criticisms probably say more about me than they say…
2025-06-24 20:12:10
I meet a lot of people who don’t like their jobs, and when I ask them what they’d rather do instead, about 75% say something like, “Oh, I dunno, I’d really love to run a little coffee shop.” If I’m feeling mischievous that day, I ask them one question: “Where would you get the coffee beans?”
If that’s a stumper, here are some followups:
Which kind of coffee mug is best?
How much does a La Marzocco espresso machine cost?
Would you bake your blueberry muffins in-house or would you buy them from a third party?
What software do you want to use for your point-of-sale system? What about for scheduling shifts?
What do you do when your assistant manager calls you at 6am and says they can’t come into work because they have diarrhea?
The point of the Coffee Beans Procedure is this: if you can’t answer those questions, if you don’t even find them interesting, then you should not open a coffee shop, because this is how you will spend your days as a cafe owner. You will not be sitting droopy-lidded in an easy chair, sipping a latte and greeting your regulars as you page through Anna Karenina. You will be running a small business that sells hot bean water.
The Coffee Beans Procedure is a way of doing what psychologists call unpacking. Our imaginations are inherently limited; they can’t include all details at once. (Otherwise you run into Borges’ map problem—if you want a map that contains all the details of the territory that it’s supposed to represent, then the map has to be the size of the territory itself.) Unpacking is a way of re-inflating all the little particulars that had to be flattened so your imagination could produce a quick preview of the future, like turning a napkin sketch into a blueprint.1
When people have a hard time figuring out what to do with their lives, it’s often because they haven’t unpacked. For example, in grad school I worked with lots of undergrads who thought they wanted to be professors. Then I’d send ‘em to my advisor Dan, and he would unpack them in 10 seconds flat. “I do this,” he would say, miming typing on a keyboard, “And I do this,” he would add, gesturing to the student and himself. “I write research papers and I talk to students. Would you like to do those things?”
Most of those students would go, “Oh, no I would not like to do those things.” The actual content of a professor’s life had never occurred to them. If you could pop the tops of their skulls and see what they thought being a professor was like, you’d probably find some low-res cartoon version of themselves walking around campus in a tweed jacket going, “I’m a professor, that’s me! Professor here!” and everyone waving back to them going, “Hi professor!”
Or, even more likely, they weren’t picturing anything at all. They were just thinking the same thing over and over again: “Do I want to be a professor? Hmm, I’m not sure. Do I want to be a professor? Hmm, I’m not sure.”
Why is it so hard to unpack, even a little bit? Well, you know how when you move to a new place and all of your unpacked boxes confront you every time you come home? And you know how, if you just leave them there for a few weeks, the boxes stop being boxes and start being furniture, just part of the layout of your apartment, almost impossible to perceive? That’s what it’s like in the mind. The assumptions, the nuances, the background research all get taped up and tucked away. That’s a good thing—if you didn’t keep most of your thoughts packed, trying to answer a question like “Do I want to be a professor?” would be like dumping everything you own into a giant pile and then trying to find your one lucky sock.
When you fully unpack any job, you’ll discover something astounding: only a crazy person should do it.
Do you want to be a surgeon? = Do you want to do the same procedure 15 times a week for the next 35 years?
Do you want to be an actor? = Do you want your career to depend on having the right cheekbones?
Do you want to be a wedding photographer? = Do you want to spend every Saturday night as the only sober person in a hotel ballroom?
If you think no one would answer “yes” to those questions, you’ve missed the point: almost no one would answer “yes” to those questions, and those proud few are the ones who should be surgeons, actors, and wedding photographers.
High-status professions are the hardest ones to unpack because the upsides are obvious and appealing, while the downsides are often deliberately hidden and tolerable only to a tiny minority. For instance, shortly after college, I thought I would post a few funny videos on YouTube and, you know, become instantly famous2. I gave up basically right away. I didn’t have the madness necessary to post something every week, let alone every day, nor did it ever occur to me that I might have to fill an entire house with slime, or drive a train into a giant pit, or buy prosthetic legs for 2,000 people. If you read the “leaked” production guide written by Mr. Beast, the world’s most successful YouTuber, you’ll quickly discover how nutso he is:
I’m willing to count to one hundred thousand, bury myself alive, or walk a marathon in the world’s largest pairs of shoes if I must. I just want to do what makes me happy and ultimately the viewers happy. This channel is my baby and I’ve given up my life for it. I’m so emotionally connected to it that it’s sad lol.
(Those aren’t hypothetical examples, by the way; Mr. Beast really did all those things.)
Apparently 57% of Gen Z would like to be social media stars, and that’s almost certainly because they haven’t unpacked what it would actually take to make it. How many of them have Mr. Beast-level insanity? How many are willing to become indentured servants to the algorithm, to organize their lives around feeding it whatever content it demands that day? One in a million?
Another example: lots of people would like to be novelists, but when you unpack what novelists actually do, you realize that basically no one should be a novelist. For instance, how did Tracy Wolff, author of the Crave “romantasy” series, become one of the most successful writers alive? Well, this New Yorker piece casually mentions that Wolff wrote “more than sixty” books between 2007 and 2018. That’s 5.5 novels per year, every year, for 11 years, before she hit it big. And she’s still going! She has so many books now that her website has a search bar. Or you can browse through categories like “Contemporary Romance (Rock Stars/Bad Boys)”, “Contemporary Erotic Billionaire Romance”, “Contemporary Romance (Harlequin Desire)”, and “Contemporary New Adult Romance (Snowboarders!)”.
Wolff and Beast might seem extreme, but they’re only extreme in terms of output, not in terms of time on task. This is the obvious-but-overlooked insight that you find when you unpack: people spend so much time doing their jobs. Hours! Every day! It’s 2pm on a Tuesday and you’re doing your job, and now it’s 3:47pm and you’re still doing it. There’s no amount of willpower that can carry you through a lifetime of Tuesday afternoons. Whatever you’re supposed to be doing in those hours, you’d better want to do it.
For some reason, this never seems to occur to people. I was the tallest kid in my class growing up, and older men would often clap me on the back and say, “You’re gonna be a great basketball player one day!” When I’d balk, they’d be like, “Don’t you want to be on a team? Don’t you want to represent your school? Don’t you want to wear a varsity jacket and go to regionals?” But those are the wrong questions. The right questions, the unpacked questions, are: “Do you want to spend three hours practicing basketball every day? Do you want to dribble and shoot over and over again? On Thursday nights, do you want to ride the bus and sit on the bench while your more talented friends compete, secretly hoping that Brent sprains his ankle so you can have a chance to play?” And honestly, no! I don’t! I’d rather be at home playing Runescape.
When you come down from the 30,000-foot view that your imagination offers you by default, when you lay out all the minutiae of a possible future, when you think of your life not as an impressionistic blur, but as a series of discrete Tuesday afternoons full of individual moments that you will live in chronological order and without exception, only then do you realize that most futures make sense exclusively for a very specific kind of person. Dare I say, a crazy person.
Fortunately, I have good news: you are a crazy person.
I don’t mean you’re crazy in the sense that you have a mental illness, although maybe you do. I mean crazy in the sense that you are far outside the norm in at least one way, and perhaps in many ways.
Some of you guys wake up at 5am to make almond croissants, some of you watch golf on TV, and some of you are willing to drive an 80,000-pound semi truck full of fidget spinners across the country. There are people out there who like the sound of rubbing sheets of Styrofoam together, people who watch 94-part YouTube series about the Byzantine Empire, people who can spend an entire long-haul flight just staring straight ahead. Do you not realize that, to me, and to almost everyone else, you are all completely nuts?
No, you probably don’t realize that, because none of us do. We tend to overestimate the prevalence of our preferences, a phenomenon that psychologists call the “false consensus effect”3. This is probably because it’s really really hard to take other people’s perspectives, so unless we run directly into disconfirming evidence, we assume that all of our mental settings are, in fact, the defaults. Our idiosyncrasies may never even occur to us. You can, for instance, spend your whole life seeing three moons in the sky, without realizing that everybody else sees only one:
the first time i looked up into the night sky after i got glasses, [I] realized that you can, in fact, see the moon clearly. i assumed people who depicted it in art were taking creative license bc they knew it should look like that for some reason, and that the human eye was incapable of seeing the moon without also seeing two other, blurrier moons, sort of overlapping it
In my experience, whenever you unpack somebody, you inevitably discover something extremely weird about them. Sometimes you don’t have to dig that far, like when your friend tells you that she likes “found” photographs—the abandoned snapshots that turn up at yard sales and charity shops—and then adds that she has collected 20,000 of them. But sometimes the craziness is buried deep, often because people don’t think it’s crazy at all, like when a friend I knew for years casually disclosed that she had dumped all of her previous boyfriends because they had been insufficiently “menacing”.
This is why people get so brain-constipated when they try to choose a career, and why they often pick the wrong one: they don’t understand the craziness that they have to offer, nor the craziness that will be demanded of them, and so they spend their lives jamming their square-peg selves into round-hole jobs. For example, when I was in academia, there was this bizarre contingent of administrators who found college students vaguely vexing and exasperating. When the sophomores would, say, make a snowman in the courtyard with bodacious boobs, these dour admins would shake their heads and be like, “College kids are a real pain in the ass, huh!” They didn’t seem to realize that their colleagues actually liked hanging out with 18-22 year-olds, and that the occasional busty snowman was actually what made the job interesting. I don’t think these curmudgeonly managers even thought such a preference was possible.
Another example: when I was a pimply-faced teenager, I went to this dermatologist who always seemed annoyed to see patients. Like, how dare we bother him by seeking the services that he provides? Meanwhile, Dr. Pimple Popper—a YouTube account that does exactly what it says on the tin—has nearly 9 million subscribers. Clearly, there are people out there who find acne fascinating, and dermatology is one of the most competitive medical specialties, but apparently you can, through sheer force of will, lack of self-knowledge, and refusal to unpack the details, earn the right to do a job you hate for the rest of your life.
On the other hand, when people match their crazy to the right outlet, they become terrifyingly powerful. A friend from college recently reminded me of this guy I’ll call Danny, who was crazy in a way that was particularly useful for politics, namely, he was incapable of feeling humiliated. When Danny got to campus freshman year, he announced his candidacy for student body president by printing out like a thousand copies of his CV—including his SAT score!—and plastering them all over campus. He was, of course, widely mocked. And then the next year, he won. It turns out that people vote for the name that they recognize, and it doesn’t really matter why they recognize it. By the time Danny ran for reelection and won in a landslide, he was no longer the goofy freshman who taped a picture of his own face to every lamp post. At that point, he was the president.45
Unpacking is easy and free, but almost no one ever does it because it feels weird and unnatural. It’s uncomfortable to confront your own illusion of explanatory depth, to admit that you really have no idea what’s going on, and to keep asking stupid questions until that changes.
Making matters worse, people are happy to talk about themselves and their jobs, but they do it at this unhelpful, abstract level where they say things like, “oh, I’m the liaison between development and sales”. So when you’re unpacking someone’s job, you really gotta push: what did you do this morning? What will you do after talking to me? Is that what you usually do? If you’re sitting at your computer all day, what’s on your computer? What programs are you using? Wow, that sounds really boring, do you like doing that, or do you endure it?
You’ll discover all sorts of unexpected things when unpacking, like how firefighters mostly don’t fight fires, or how Twitch streamers don’t just “play video games”; they play video games for 12 hours a day. But you’re not just unpacking the job; you’re also unpacking yourself. Do any aspects of this job resemble things you’ve done before, and did you like doing those things? Not “Did you like being known as a person who does those things?” or “Do you like having done those things?” but when you were actually doing them, did you want to stop, or did you want to continue? These questions sound so stupid that it’s no wonder no one asks them, and yet, somehow, the answers often surprise us.
That’s certainly true for me, anyway. I never unpacked any job I ever had before I had it. I would just show up on the first day and discover what I had gotten myself into, as if the content of a job was simply unknowable before I started doing it, a sort of “we have to pass the bill to find out what’s in it” kind of situation. That’s how I spent the summer of 2014 as a counselor at a camp for 17-year-olds, even though I could have easily known that job would require activities that I hated, like being around 17-year-olds. Could I have known specifically that my job would include such tasks as “escorting kids across campus because otherwise they’ll flee into the woods” or “trying to figure out whether anyone brought booze to the dance by surreptitiously sniffing kids’ breath?” No. But had I unpacked even a little bit, I would have picked a different way to spend my summer, like selling booze to kids outside the dance.
It’s no wonder that everyone struggles to figure out what to do with their lives: we have not developed the cultural technology to deal with this problem because we never had to. We didn’t exactly evolve in an ancestral environment with a lot of career opportunities. And then, once we invented agriculture, almost everyone was a farmer for the next 10,000 years. “What should I do with my life?” is really a post-1850 problem, which means, in the big scheme of things, we haven’t had any time to work on it.
The beginning of that work is, I believe, unpacking. As you slice open the boxes and dump out the components of your possible futures, I hope you find the job that’s crazy in the same way that you are crazy. And then I hope you go for it! Shoot for the stars! Even if you miss, you’ll still land on one of the three moons.
You can think of unpacking as the opposite of attribute substitution; see How to Be Wrong and Feel Bad.
In my defense, this was a decade ago, closer to the days when you could become world famous by doing a few different dances in a row.
There is also a “false uniqueness effect”, but it seems to show up more rarely, on traits where people are motivated to be better than others, or when people have biased information about themselves. So people who like Hawaiian pizza probably think their opinion is more common than it is (false consensus). But if you pride yourself on the quality of your homemade Hawaiian pizza, you probably also overestimate your pizza-making skills (false uniqueness).
I’m pretty sure every campus politician was like this. During one election cycle, the pro-Palestine and pro-Israel groups started competing petitions to remove/keep a brand of hummus in the dining hall that allegedly had ties to the IDF. One of the guys running for class rep signed both petitions. When someone called him out, his response was something like, “I’m just glad we’re having dialogue.” Anyway, he won the election.
A few years later, a sophomore ran for student body president on a parody campaign, promising waffle fries and “bike reform.” He won a plurality of votes in the general election, but lost in the runoff, though he did get a write-up in the New York Times. Now he’s a doctor.
Top-tier insanity can sometimes make up for mid-tier talent. I’ve been in five-ish different improv communities, and in every single one there was someone who was pretty successful despite not being very good at improv. These folks were willing to mortgage the rest of their life to support their comedy habit—they’d half-ass their jobs, skip class, ignore their partners and kids, and in return they could show up for every audition, every gig, every side project. Their laser focus on their dumb art didn’t make them great, but it did make them available. Everybody knew them because they were always around, and so when one of your cast mates dropped out at the last second and you needed someone to fill in, you’d go, “We can always call Eric.” If you’ve ever seen someone on Saturday Night Live who isn’t very funny and wondered to yourself, “How did they get there?”, maybe that’s how.
2025-06-10 20:11:22
It’s cool to run big, complicated science experiments, but it’s also a pain in the butt. So here’s a challenge I set for myself: what’s the lowest-effort study I could run that would still teach me something? Specifically, these studies should:
Take less than 3 hours
Cost less than $20
Show me something I didn’t already know
Be a “hoot”
I call these Dumb Studies, because they’re dumb. Here are three of them.
(You can find all the data and code here.)
I’m bad at tasting things. I once found a store-bought tiramisu at the back of the fridge and was like “Ooh, tiramisu!” Then I ate some and was like, “Huh this tiramisu is kinda tangy,” and when my wife tasted it, she immediately spat it out and said, “That’s rancid.” We looked at the box and discovered that the tiramisu had expired several weeks earlier. I would say this has permanently harmed my reputation within my family.
That experience left me wondering: just how bad are my taste buds? Like, in a blind test, would I even be able to tell different flavors apart? I know that sight influences taste, of course—there are all sorts of studies dunking on wine enthusiasts: they can’t match the description of a wine to the actual wine, they like cheaper wine better when they don’t know the price, and if you put some red food coloring in white wine, people think it’s red wine.1 But what if I’m even worse than that? What if, when I close my eyes, I literally can’t tell what’s in my mouth?
~~~MATERIALS~~~
My friend Ethan bought four kinds of baby food. They were all purees, so I couldn’t use texture as a clue, and I didn’t look at any of the containers beforehand.
~~~PROCEDURE~~~
I put on a blindfold and tasted a spoonful of each kind of baby food and tried to guess what it was.
~~~RESULTS~~~
Here’s how I did:
~~~DISCUSSION~~~
I would rate my performance as “humiliating”. Butternut squash and sweet potato are pretty similar, so I’ll give myself that one, but what kind of idiot tastes “pear” and thinks “lemon-lime”? I knew in the moment that there was probably no such thing as “lemon-lime” baby food (did Gerber’s acquire Sprite??), but that’s literally what it tasted like, so that’s what I said. Mixing up banana and strawberry was way below even my very low expectations for myself. When I took the blindfold off, people looked genuinely concerned.2
Here’s something interesting that happened: once my friends revealed the identity of each flavor, I immediately “tasted” it. It was like looking at one of those visual illusions that looks like a bunch of blobs and then someone tells you “it’s a parrot!” and suddenly the parrot jumps out at you and you can’t not see the parrot anymore. Except in this case, the parrot was banana-flavored.
My friends and I were hosting a party and we thought it would be funny to ask people to stick their hands in various buckets, just to see how long they would do it. We didn’t exactly have a theory behind this. We just thought something weird might happen.3
~~~MATERIALS~~~
We got two buckets, filled one with ice water, and filled the other bucket with nothing.
~~~PROCEDURE~~~
We flipped a coin to determine which bucket each partygoer (N = 23) would encounter first. (The buckets were in separate rooms, so they didn’t know which one was coming next.) Upon entering each room, we told the participant, “Please put your hand in the bucket for as long as you want.” Then we timed how long they kept their hands in each bucket.
~~~RESULTS~~~
On average, people kept their hands in the ice bucket for 49.26 seconds, and they kept their hands in the empty bucket for 31.57 seconds. The difference between these two averages was not statistically significant.4
But averages aren’t very revealing here, because people differed a lot. Here’s another way of looking at the same data. Each participant has their own row and two dots: a red dot for how long they spent in the empty bucket, and a blue dot for how long they spent in the ice bucket. For privacy, all participants’ names have been replaced with the names of Muppets.
~~~DISCUSSION~~~
We learned two things from this study.
People are weird.
Putting your hand in a bucket of ice is supposed to be a universally negative experience. It’s known in the science biz as the “cold pressor task,” and they use it to study pain because it hurts so bad. But some people liked it! One guy thanked the experimenter for the opportunity to make his hand really cold, which he enjoyed very much. Another revealed this: you know those drink chillers at the grocery store where you can put a bottle of white wine or a six pack in a vat of icy water and it swirls around and it chills your drink really fast? He used to stick his hand in one of those for fun.
Feeling pointless might hurt worse than feeling pain.
Say what you will about sticking your hand in an ice bucket: it’s something to do. You feel like you’re testing your mettle, your skin changes colors, your fingers tingle and that’s kinda fun. When you put your hand in an empty bucket, nothing happens. You just stand there like an idiot with your hand in a bucket. People think physical pain is inherently negative, like it’s pure badness. But when you lock eyes with Miss Pain Piggy and she holds her hand in ice water for 466 seconds straight, you start to question a lot of assumptions.5
Here’s something that’s always bugged me: people love sugar and salt, right? I mean, duh, of course they do. So why doesn’t anyone pour themselves a big bowl of salt and sugar and chow down? Is it just social norms and willpower preventing us from indulging our true desires? Or is it because pure sugar and salt don’t actually taste that good? Could it be that our relationship with these delicious rocks is, in fact, far more nuanced than simply wanting as much of them as possible?
This study was partly inspired by cybernetic psychology, which posits that the mind is full of control systems that try to keep various life-necessities at the right level. Sugar and salt are both necessary for life, and people certainly do seem to desire both of them. And yet, if you eat too much of them, you die. That sounds like a job for a control system—maybe there’s some kind of body-brain feedback loop trying to keep salt and sugar at the appropriate level, not too high and not too low. One way to investigate a control system is just to put stuff in front of someone and see what they do. That sounded pretty dumb, so that’s what I did.
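If it helps to see the control-system idea concretely, here is a toy sketch: appetite as a simple proportional rule around a setpoint. The numbers and the linear rule are made up purely for illustration; they are not from any real model of salt regulation.

```python
# Toy sketch of the control-system idea: "desire" for salt as a proportional
# controller around a setpoint. All values here are invented for illustration.
SETPOINT = 1.0   # hypothetical "right" internal salt level (arbitrary units)
GAIN = 2.0       # how strongly deviation from the setpoint drives appetite

def salt_appetite(current_level: float) -> float:
    """Positive = craving salt, negative = finding it aversive."""
    return GAIN * (SETPOINT - current_level)

for level in (0.5, 1.0, 1.5):
    print(f"internal level {level}: appetite {salt_appetite(level):+.1f}")
# Below the setpoint you crave it, at the setpoint you're indifferent,
# above it a spoonful of salt tastes nasty.
```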
~~~MATERIALS~~~
I got a measuring spoon marked “1/4 teaspoon” and some salt and sugar.
~~~PROCEDURE~~~
I ran this study at an Experimental History meetup in Cambridge, MA6. I brought people (N = 23)7 to a testing room and sat them at a desk. I first showed them the measuring spoon and asked them, “If I were to give you this amount of sugar, is that something you would like to eat?” If they said “Yes”, I poured 1/4 teaspoon of sugar into a tiny cup and gave it to them. Once they ate it, I asked them to rate the experience from 1 (not good at all) to 5 (very good), and I asked them if they’d like to do it again. If they said yes again, I gave them another 1/4 teaspoon of sugar and got their rating. I repeated this process until they refused the sugar. (Nobody took more than two shots.) Then I repeated the process with 1/4 teaspoon of salt.8 I should have randomized the order, but in all the excitement, I forgot.
~~~RESULTS~~~
About half of the participants flat-out refused the sugar, and two-thirds refused the salt. Anecdotally, many of the people who refused the sugar said something like, “oh, I’d like to eat it, but I shouldn’t.” People did not feel that way about the salt. They were just like, “No thanks”.9
The people who did try the sugar generally liked it. The people who tried the salt did not. (The latter were, by the way, all men.) A few of the guys put on a brave face after downing their salt, but the rest said things like “blech!” and “oh!”. Four people took an extra shot of sugar, and they liked it fine. The two people who took an extra salt shot gave the experience a big thumbs-down.
~~~DISCUSSION~~~
Why don’t people eat big spoonfuls of sugar and salt? For sugar, the answer might be “we think it’s sinful”. But it also might be because raw sugar isn’t actually super delicious. I’m a bit surprised that the sugar ratings weren’t even higher—isn’t sugar supposed to be pure bliss?10 For salt, the answer might just be “it tastes bad on its own”.
It’s weird that people had such strong reactions to such small amounts. There’s about 1g of sugar in 1/4 teaspoon, and a single Reese’s peanut butter cup—a notoriously delicious treat—contains 11x that much. Meanwhile, 1/4 teaspoon of salt is about 1.4g, and I happily ate more than that in a single sitting yesterday via a pile of tater tots dunked in ketchup. For some reason, people seem to find these minerals far more appealing when they’re mixed with other stuff.
Why would that be? Maybe it has to do with how much you need to survive, and how much you can eat before you die.
The estimated lethal dose of salt is 4g per kilogram of body weight, and people really do die from ingesting too much of it. In one case, a Japanese woman had a fight with her husband, drank a liter of shoyu sauce containing an estimated 160g of salt, and died the next day. In another case, a psychiatric hospital forced a 69-year-old man to drink water containing 216g of salt (they wanted him to throw up because he had ingested his roommate’s medication); he was declared brain dead 36 hours later.
Meanwhile, the estimated lethal dose of sugar is much higher: 30g per kilogram of body weight. An extremely trustworthy-seeming Buzzfeed article called “It’s Actually Pretty Hard to Eat So Much Sugar that You Die” estimates that the average adult would need to eat 680 Hershey’s Kisses, 408 Twix Minis, or 1,360 pieces of candy corn before they kicked the bucket.
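If you want to sanity-check those figures, here is a quick back-of-the-envelope sketch; the body weight and the sugar-per-Kiss number are my own assumptions, not anything from the article.

```python
# Back-of-the-envelope check on the lethal-dose figures above. The body weight
# and per-Kiss sugar content are assumptions, not numbers from the article.
BODY_WEIGHT_KG = 68          # assumed "average adult"
SALT_LD_G_PER_KG = 4         # estimated lethal dose of salt
SUGAR_LD_G_PER_KG = 30       # estimated lethal dose of sugar
SUGAR_PER_KISS_G = 3         # roughly the sugar in one Hershey's Kiss

lethal_salt_g = SALT_LD_G_PER_KG * BODY_WEIGHT_KG      # ~270 g of salt
lethal_sugar_g = SUGAR_LD_G_PER_KG * BODY_WEIGHT_KG    # ~2,040 g of sugar
kisses_to_die = lethal_sugar_g / SUGAR_PER_KISS_G      # ~680 Kisses

print(f"Lethal salt: ~{lethal_salt_g:.0f} g; lethal sugar: ~{lethal_sugar_g:.0f} g "
      f"(~{kisses_to_die:.0f} Kisses)")
```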
It takes a lot longer to eat several pounds of candy than it does to chug a liter of shoyu, so it’s easier for the body to defend against a sugar overdose than a salt overdose (by making you feel nauseous, cramping, throwing up, etc.). The best way to avoid death by salt, then, is to avoid eating large doses of salt in the first place, and the best way to do that is to make it taste bad. Maybe that’s why the same amount of salt tastes nasty when served raw and tastes delicious when sprinkled over a basket of french fries or dissolved in a bowl of soup—you’re only getting a little bit at a time, so you won’t shoot past your target level.11
Anyway, these results suggest that we do not “love” sugar and salt. We love a certain amount of sugar and salt, consumed at a certain rate, and perhaps even in a certain ratio to other nutrients. The results also suggest that coming to an Experimental History meetup is a super fun and cool time.
I’m showing you Dumb Studies as if they’re something new. But they’re not. At the beginning of science, all studies were Dumb.
Robert Boyle, the father of chemistry, did a thorough investigation of a piece of veal that was weirdly shiny (results inconclusive). Antonie van Leeuwenhoek, the father of microbiology, blew smoke at worms to see if it would kill them (it didn’t). Robert Hooke, the father of a bunch of stuff12, sprinkled some flour on a glass plate and then ran a bow along the side like he was playing the fiddle and was like “ooh look at the lines the vibration makes”. These studies looked stupid even then, and people duly ridiculed them for it.
Ever since then, the most groundbreaking scientists have always spent a big chunk of their time—perhaps most of their time—goofing around. Francis Galton, the guy who invented like 10% of modern science13, took a secret whistle to the zoo and whistled at all the animals (the lions hated it). Barbara McClintock learned how to control her perception of pain so she wouldn’t need anesthesia during dental procedures. Richard Feynman did about a million Dumb Studies, including a demonstration that urination isn’t driven by gravity because you can pee standing on your head. The neurologist V.S. Ramachandran was able to temporarily turn off amputees’ phantom limbs by squirting water in their ears and making them look at mirrors. They all had what I call experimenter’s urge: the desire to, quite literally, fuck around and find out.
After science became a profession, we started expecting our science to look very science-y, no Dumb Studies allowed. On top of that, the replication crisis left us all with a cop mentality that treats anything fun as suspicious. People want to blame the slowdown of scientific progress on the “burden of knowledge”14 or “ideas getting harder to find”—I disagree, and will fight such people, but I do agree that we’re suffering under a modern burden: the burden of respectability.15
There’s a time and a place for the Serious Study. Sometimes you’re spending millions of dollars, for instance, and you can’t afford to be loosey-goosey with the procedure. But reality is very weird, and if you ever want to understand it, you have to bump into it over and over, in as many places and from as many angles as possible. You need the freedom to be Dumb. You must inspect the shining meat, you must pee standing on your head, and you must, I submit, eat this baby food.
I thought this was common knowledge, but apparently it’s not. In this New Yorker article about blind wine taste-testing, a professor of “viticulture and enology” confidently states that no one would ever mix up red wine and white wine right before the author does exactly that.
Come to think of it, why is “strawberry-banana” such a common flavor? Where did that come from?
An initial writeup of this experiment was previously published in the first, and so far only, issue of The Loop.
Specifically, t(22) = 0.69, p = .50. The more statistically-minded folks might be wondering: “Did you have enough power to detect an effect here? You only had 23 participants, after all.” Great question! With N = 23, we have about an 80% chance to detect an effect of d = .6 with a two-tailed paired t-test. That’s roughly what we consider a “medium” effect, based on something one statistician said once. To put that in context, the standardized effect of SSRIs on depression is .4, the effect of ibuprofen on arthritis pain is .42, and the effect of “women being more empathetic than men” is .9. The Bayes factor for this difference is .27, meaning moderately strong evidence for the null, according to something another statistician said once. So we can’t say there’s no difference between the empty bucket and the ice bucket, but if there is any difference, we can be pretty confident that it isn’t large.
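If you want to poke at these numbers yourself, here is a minimal sketch of that analysis. It is not the actual script from the repo linked above, and the two arrays are random placeholders standing in for the real 23 paired timings; swap in the real data to reproduce the reported t(22) = 0.69, p = .50.

```python
# Minimal sketch of the paired t-test and power calculation in the footnote.
# The arrays are random placeholders, NOT the real bucket timings.
import numpy as np
from scipy.stats import ttest_rel
from statsmodels.stats.power import TTestPower

rng = np.random.default_rng(0)
ice_secs = rng.normal(49, 90, size=23)    # seconds in the ice bucket (placeholder)
empty_secs = rng.normal(32, 40, size=23)  # seconds in the empty bucket (placeholder)

# Paired t-test on the 23 participants
t_stat, p_val = ttest_rel(ice_secs, empty_secs)

# Power to detect a "medium" paired effect (d = 0.6) with N = 23, two-tailed
power = TTestPower().solve_power(effect_size=0.6, nobs=23, alpha=0.05,
                                 alternative='two-sided')

print(f"t({len(ice_secs) - 1}) = {t_stat:.2f}, p = {p_val:.2f}")
print(f"Power for d = 0.6 at N = 23: {power:.2f}")  # ~0.78, i.e. the "about 80%" above
```

The Bayes factor could be reproduced with, for example, pingouin’s paired `ttest`, which reports a BF10 alongside the frequentist results.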
Maybe this is also why, if you leave people alone in an empty room with a shock machine, they will voluntarily shock themselves.
This meetup was co-hosted with The Browser, a terrific newsletter that curates interesting internet stuff.
Not a typo; somehow every party I host ends up with 23 people at it. (Some people were there both times, but most weren’t.)
The party took place in the afternoon and had both salty and sweet snacks available, so each person was coming in with a different amount of sugar and salt already in their system.
This was one of the fun parts of running Dumb Studies: you let people do interesting things. In a Serious Study in psychology, for instance, we don’t usually let people say “no”. I mean, we do, obviously, for ethical reasons, but if they refuse some part of the study, then the study is over.
When you run a Dumb Study, you can treat all behavior as data. If someone doesn’t want to put their hand in the empty bucket or they don’t want to eat the salt, that’s not noise. That’s a result.
I don’t quite have enough data to tell, but maybe there’s more variance in sugar preferences than there is in salt preferences. One participant remarked that he had eaten pure sugar earlier that day. And “salt tooth”, while apparently a thing, is far less common than “sweet tooth”, and it sounds like a D-list pirate name.
This would also predict that Gatorade tastes better after a run on a hot day; you’ve sweated out some of your sugar and salt stores, so your taste buds give you a thumbs-up for re-up.
For instance, Hooke’s law, Hooke’s joint, Hooke’s instrument, and Hooke’s wheel.
To be clear, both good stuff and bad stuff.
Supposedly the “last man who knew everything” was English polymath Thomas Young, who died in 1829.
Weirdly enough, “being respectable” does not include “posting your data and code”, which most studies do not do.