The Tell

In 2018 I wrote down that machines could already catch the writing of smarter machines. Then the tell vanished—and took an older faith with it.

In the spring of 2018, before I had any idea I would end up working anywhere near this, I wrote a single sentence in a notebook: we’re at a point where machines can distinguish the writing of more complex machines. It felt, at the time, like a small marvel worth recording—and like the beginning of something stable. The text generators of that moment were crude, and they gave themselves away, and what struck me was that you didn’t need a great intelligence to catch them. A modest classifier could do it. The simpler machine could reliably flag the work of the more complex one. There was something almost reassuring in that, an implied law underneath it: the artificial always betrays itself to a careful enough eye. We would build these things, and we would still be able to see through them.

I have kept the sentence, because it turned out to be wrong in the most instructive possible way, and I want to be precise about how.

Start with why the early machines were catchable, because the reason is the whole story in miniature. Their writing had a fingerprint, and the fingerprint was predictability. A language model, then as now, works by guessing the likely next word, and the early ones guessed the likely word too faithfully—they reached, again and again, for the safe continuation, the high-probability token, the unsurprising turn. Human writing is lumpier than that. We swerve, we choose the odd word, we are surprising in small involuntary ways that a probability machine, optimizing for fluency, smooths flat. So a detector never had to understand a word of the text it was judging. It only had to measure surprise—to ask how predictable each word was given the ones before it—and the suspicious lack of surprise gave the machine away. The tell was not the presence of intelligence. It was the absence of the small human noise. The detector found the seam where the artifice was simply too regular to be a person.

This is what I had seen, and it was real, and for a brief window it held. The trouble is that I read it as a permanent feature of the relationship between people and machines, when it was a temporary feature of a gap that was about to close.

The models scaled, and they crossed a threshold, and the fingerprint faded.

It did not fade gradually toward some hard floor of residual detectability. It mostly just went away. The text got various enough, surprising enough, lumpy enough in its statistics that the seam I described closed, and the detectors that had keyed on smoothness lost the thing they were measuring. And here is where it stops being a story about declining accuracy and becomes a story about something breaking, because the detectors did not fail quietly. They failed by accusing the innocent. They began flagging human writing as machine—confidently, at scale. People fed the United States Constitution into them and got back a verdict of likely AI. People fed in their own blog posts from years before the models existed and watched them get tagged as synthetic. And most damning of all, and least forgivable, the detectors turned out to flag the writing of non-native English speakers at catastrophic rates—because a person writing careful, slightly unidiomatic, second-language English produces exactly the kind of even, hedged, low-surprise prose that the detector had learned to read as a machine. The tool built to catch the artificial reserved its harshest false confidence for the humans who wrote a little too carefully. Universities that had rushed to install these things began switching them off. OpenAI, of all institutions, built a classifier to detect AI-generated text and then quietly withdrew it, conceding it did not work.

And the arms race underneath it all turned out to be structurally rigged in one direction. Any feature a detector learns to key on can be trained or smoothed away, because the generator can always be pointed at the detector and taught to slip past it. You barely even need that much. Run a piece of machine text through a second model and ask it to rephrase—recursive paraphrasing, the researchers call it—and detection collapses from catching most of the text to catching almost none of it. There is no fixed signature to defend, because the thing producing the signature can be asked to produce a different one. The simpler machine catching the more complex machine, the small reassuring law I thought I’d glimpsed, was an artifact of a single early moment when the gap between them was wide enough to see across. The gap closed. The law was never a law. It was a sunset I mistook for a fixture of the sky.

So I had the local fact right and the deep assumption wrong, and it is worth being clear about which assumption, because it is older and more comfortable than anything to do with language models, and it is the thing that actually died.

The belief underneath my 2018 sentence was that the artificial always betrays itself to scrutiny—that the fake has a seam, that with sufficient attention the eye can always find the tell. We lean on this faith almost everywhere without naming it. It is in I’d be able to tell. It is in the whole posture of vigilance as the defense against being deceived, the conviction that careful looking is, in the end, protection. And it is precisely this that the fluent machine retired. Past a certain threshold of competence, the artifice stops leaving a seam. There is no tell. The strategy of looking closer, which has guarded us against counterfeits since there have been counterfeits, has quietly stopped working—and not only for text. The sentence no one wrote and the face that does not exist are the same event in two different media. In both, the fake crossed the line where it stopped betraying itself, and in both the casualty was not the truth, exactly, but something more intimate: the comfortable belief that truth was inspectable, that the real announced itself to a careful eye, that you could tell the genuine from the manufactured by attending hard enough to the thing in front of you. You can’t. Not from the surface. Not anymore.

What is left, once you accept that, is not better detection. The honest state of the art is that you cannot reliably tell, and most of the industry built on promising otherwise is selling false confidence with real victims attached—the flagged student, the doubted writer, the candidate misjudged on a probability score that means almost nothing. What replaces detection, to the extent anything can, is provenance—the attempt to move the question from “can I tell by looking?”, where the answer is now no, to “can I trace where this came from?”, where the answer is sometimes. Watermarking is the serious version of this: a faint statistical signal stamped into the text at the moment of generation, so that the maker, or anyone the maker trusts with the key, can later confirm its origin. It is a genuine advance and it is also badly limited—it only works if the generator chose to apply it, it can be washed out by the same paraphrasing that defeats the classifiers, and verifying it usually requires access to the very system that produced the text. There is a live and unsettled argument among the researchers about whether reliable detection is merely very hard or something closer to fundamentally impossible. But the practical verdict is already in, and it is that the seam is not coming back. We are not going to look our way out of this.

And beneath provenance lies the same residue I keep arriving at from the other direction, the one the generated faces forced too. When the surface stops carrying the difference, the real has to be located somewhere a machine cannot reach—not in any feature of the text, which can be copied, but in the fact of its origin: that there was, or was not, an actual someone behind it, accountable, singular, who meant it. The difference between the human sentence and the machine sentence is no longer in the sentence. It never fully was. It is in whether anyone is answerable for it, and that is exactly the thing no amount of looking at the words will ever again tell you.

I keep the 2018 note because it is a fossil, and fossils are precise about the world that made them. It records, with eight years of hindsight now pressing down on it, the last moment at which it was still possible to believe the machine would always give itself away—that we had built something we could, with effort, continue to see through. The sentence is not wrong. It is dated, in the exact way the man with the lantern was early: it caught a truth standing right at the threshold and did not yet know which way the threshold would tip. It tipped. The machine learned to write without a tell, and what it took from us on the way out was not our ability to write, which we still have, but our ability to be sure, just by looking, that a person did. The eye is no longer a witness. We are going to have to find another way to know who is there—and we are going to have to do it in a world that has only just stopped admitting the old way ever quietly failed.