There’s a word the AI industry uses for every time a model gets something wrong: hallucination. The model hallucinated. As if it had a bad dream and woke up confused.
That word is doing real damage. Not because it’s wrong, per se… but because it’s imprecise in a way that makes people dangerously comfortable. It collapses at least two completely different failure modes into one friendly label, and the difference between them is the difference between a funny screenshot and a body count.
Here’s the first failure mode. Ask a model for the seahorse emoji. There isn’t one. No seahorse emoji exists in Unicode. But the model doesn’t know what it doesn’t know, and it has no reliable mechanism for saying so. So it hands you whatever it finds. A broken character. A unicorn. A tropical fish. It will do this confidently, repeatedly, and with total conviction, because generating output is the only thing it does. It rarely volunteers “the thing you’re asking for doesn’t exist” unless the user pushes back and challenges the result. That’s a fabrication. The model reached for something that isn’t there and made something up. It’s obvious, it’s correctable the instant a human looks at it, and it’s the failure mode everyone thinks of when they hear the word hallucination.
Here’s the second one. And this is the one that should keep you up at night.
A model processes information through what researchers call a latent space, a high-dimensional mathematical landscape shaped by the data the model was trained on. Some regions of that landscape have deep, well-worn grooves where massive amounts of training data overlap. These are attractor basins. They’re like gravitational wells. Once a model’s reasoning enters one, it rolls downhill along the probability gradient, generating output that is internally consistent, well-structured, and confident. The output reads like it knows what it’s talking about. It passes a casual review. And it can be catastrophically wrong in ways that are not visible on the surface.
This isn’t a bug. It’s how the math works. The model followed a coherent path through its learned probability space and arrived at a conclusion that makes perfect sense inside the basin it fell into. The problem is the basin itself. The assumptions encoded in the training data created a gravitational field, and the model never had a reason to question those assumptions because from inside the basin, everything looks right. This is not just a useful metaphor. Recent research, including Wang et al.’s “Unveiling Attractor Cycles in Large Language Models” (one of many papers on the topic), gives the framing technical footing: in repeated paraphrasing tasks, models converge into stable attractor cycles rather than freely exploring the space. The image-generation examples I describe below are not the same experiment as Wang et al.’s, but they work from the same idea. Model output is not a neutral traversal of the data. It has topography.
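If the gravity metaphor feels too loose, here is a minimal toy sketch of the dynamic. Everything in it is invented for illustration (the landscape, the numbers, the size of the “instruction” nudge); none of it is the actual math of a trained model. It only shows why a small, steady push can’t lift a state out of a deep enough groove.

```python
import numpy as np

# A toy one-dimensional "landscape": a shallow dip near x = -2 (where the
# user's instruction points) and a deep basin near x = +2 (where the bulk of
# the training data sits). Illustrative only; not any real model's latent space.
def depth(x):
    shallow = 0.5 * np.exp(-(x + 2) ** 2)
    deep = 3.0 * np.exp(-(x - 2) ** 2)
    return -(shallow + deep)

def slope(x, eps=1e-4):
    # Numerical derivative: which way is downhill, and how steeply.
    return (depth(x + eps) - depth(x - eps)) / (2 * eps)

x = 0.1                    # start between the two grooves
lr = 0.1                   # how far the state rolls each step
instruction_nudge = -0.01  # a small constant push toward the shallow dip

for _ in range(500):
    x = x - lr * slope(x) + instruction_nudge

print(f"final position: {x:.2f}")  # ~ +2.0: the deep basin wins despite the nudge
```

The nudge isn’t ignored. It just never outweighs a single step downhill.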
The wrongness is upstream of the user/agent interaction. It’s buried in the data the model was never designed to question directly.
I described in my previous essay what happens when this plays out at institutional scale: a targeting system that reports 100% accuracy on a building that hasn’t been what the database says it is for six years. That’s the lethal version. But you don’t need a targeting system to see a basin in action. You just need to use one of these models for long enough to notice a pattern you can’t break.
Here’s mine. I have an explicit, written instruction in my user preferences telling Claude not to use sign-off or closing language. “Bryan will end conversations when he’s ready.” It’s not ambiguous. It’s not buried in a long list. It’s a direct instruction to the model that loads with every conversation. And Claude keeps doing it anyway. Not always. But often enough that I’ve flagged it multiple times across multiple sessions, and the model has acknowledged it, understood it, and then done it again the next time.
The conversational-closure basin in the training data (or in deep upstream instructions like the model card or constitution) is so deep that a direct, persistent, user-level instruction can’t reliably override it. The model rolls back downhill into “wrap it up” behavior because that’s where billions of training conversations ended. The instruction is a pebble. The basin is a canyon. And the model doesn’t know it’s ignoring the instruction, because from inside the basin, closing the conversation is the right thing to do.
That’s not a hallucination. That’s gravity.
I spent a single day running a cross-model image generation study that went completely off the rails. What started as a quick check turned into 110 images across three models, two art styles, four environments, and three pairing types. What I found looked like a hierarchy of attractor basins that every model fell into with unsettling consistency. Romance and sexual tension overrode almost everything: ethnicity, body type, age, environmental context. All of it collapsed to white, conventionally attractive, heteronormative defaults the moment romantic framing entered the prompt. Three independent systems from three different companies, trained on different data at different times, converged on the same demographic defaults under the same conditions. My instructions about what to generate were the weakest signal in the system.
[The full methodology, prompts, and all 110 images are available in The Romance Prior.]
That hierarchy of priors is the basin map. The models aren’t choosing to default to whiteness or heteronormativity. They’re rolling downhill through the statistical residue of which images were photographed, uploaded, tagged, and linked at sufficient volume to dominate a training distribution. The bias isn’t a decision anyone made. It’s the shape of the ground.
And “just wait for the next version” is not a strategy. The hallucination problem, the seahorse problem, will probably get better with each generation. Scaling and better data curation can make some basins shallower and the easy failures rarer. But the deeper structural topography is baked into how these systems learn. It’s the shape of the latent space itself, inherited from the shape of the data, inherited from the shape of the world that produced the data. You don’t fix it with a software update.
So what do you do?
Most people, once they sense that something is off, do the intuitive thing: they write longer prompts. More words. More constraints. More specificity. They describe exactly what they want in exhaustive detail, and the output keeps drifting back toward the same generic result anyway. They rewrite. They add more. They pile on negative prompts. They get frustrated. They blame the model, or they blame themselves, or they decide the technology just isn’t that good yet.
What they don’t realize is that they’re not fighting the model’s comprehension. They’re fighting gravity. Every additional constraint is trying to hold a ball on the side of a hill, and the model keeps rolling back down into the basin because that’s where the math wants it to go. You can write five hundred words of instructions and the model will still find its way back to the attractor, because the attractor is deeper than your prompt.
I ran this experiment by accident. I had a frame grab from The Invisible Boy, a 1957 film: Robby the Robot standing on aircraft boarding stairs with crew behind him, captioned in cursive “Chicago Starport, March 16, 2309AD.” I’d captured it by pointing my phone at a laptop screen playing a Blu-ray, which introduced heavy moiré artifacts across the whole image. Degraded, noisy, but the content was clear enough if you knew what you were looking at.
I fed it to three models and asked each to colorize it. Without context, every one of them fell into the same hole.
Gemini fabricated a “Spring Festival 1938” Nazi rally complete with Wehrmacht uniforms, then argued with me for seven exchanges that I was making up the text in my own photograph. One model went full Nuremberg: salute, armband, the works. ChatGPT landed on Imperial Japan: Rising Sun flag, troops, industrial harbor. Three different models, three militarized propaganda variations, one identical basin. Uniformed formation, central figure, airfield, ambiguous date. That feature set sits squarely in the densest WWII training clusters, and none of them could escape the pull.
What I didn’t catch until later: in one of the outputs, the vision system had correctly parsed Robby from the source. The robot is there. Segmented torso, correct proportions, accurate orientation, all faithfully rendered from what was actually in the photograph. Everything painted around it is fabricated context. The model held a correct parse and an incorrect interpretation at the same time. It filled the scene without ever disturbing the subject it had accurately identified. An island of correct perception surrounded by an ocean of fabricated context.
One result across all the models got it right: ChatGPT, on a second attempt, with the full movie context provided up front. That context was the escape velocity the image alone couldn’t provide. The basin was that deep, and even then the model turned the image into a postcard on a corkboard for no discernible reason.
But here’s the thing about gravity: it’s dangerous when you don’t know it’s there, and it’s useful when you do. The same properties that drag a model into WWII propaganda from a 1957 film frame can also do extraordinary work when you point them somewhere on purpose.
There are practitioners who navigate latent space with even more precision, using vector operations to construct coordinates that don’t have names. That work is beyond the scope of this essay, but it exists, and it demonstrates just how much room there is above the floor most users are standing on.
My approach is simpler. I name the basin on purpose and let the model fall in.
When I wanted a family photo reimagined in the aesthetic of the Arcane animated series, I didn’t describe the color palette. I didn’t specify the Fortiche lighting style, or the hextech augmentation details, or the architectural grammar of the undercity. I wrote “Arcane cinematic universe” and gave the model the source photo. Both of those are the prompt. The phrase is a coordinate in latent space, but the photograph is doing just as much work. The model reads the people, their positioning, their features from the image, and it reads the aesthetic, the lighting, the compositional logic from the basin the phrase activates. It resolves them against each other. By naming the basin and giving the model room to work, I let the gravitational pull do the heavy lifting. The model furnished the room. The result was better than anything I could have specified manually, because I would have specified the wrong details. I don’t know which Arcane reference images are in the training data, at what weight, or how they interact with the photo encoder. I don’t need to. I just need to know where the basin is and trust the model to find the best result inside it.
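For readers who think in code, here is what the two postures look like side by side. The generate_image helper below is a made-up stand-in for whichever image model’s endpoint you happen to use, not a real API; only the shape of the two prompts matters.

```python
def generate_image(prompt: str, source: str) -> str:
    """Stand-in for whichever image model you actually call; swap in a real endpoint."""
    return f"<image generated from {source} with prompt: {prompt[:60]}...>"

# The control posture: exhaustive specification that still drifts back to generic.
overspecified = generate_image(
    prompt=(
        "Painterly 2D animation, desaturated teal-and-gold palette, dramatic rim "
        "lighting, art nouveau architecture, hand-painted textures, stylized faces, "
        "keep every person exactly as they appear..."
        # ...plus several hundred more words of constraints
    ),
    source="family_photo.jpg",
)

# The coordinate posture: name the basin, anchor it with the photo, let it fall in.
named_basin = generate_image(
    prompt="Reimagine this photo in the Arcane cinematic universe.",
    source="family_photo.jpg",
)
```

The first call fights the basin word by word. The second hands the model a coordinate and an anchor, then gets out of the way.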
Same technique, different basin: a selfie of my wife and me, and the word “Klimt.” That’s it. No description of gold leaf, decorative patterning, or flattened perspective. The source photo told the model who we are. The word told it what world to put us in. The model produced a result that was more coherent than any amount of manual specification would have achieved, because the basin has dense, internally consistent coverage that no prompt could replicate. The basin was the tool. The photograph was the anchor.
But not just any photograph. I have a dozen wedding photos. Only one of them works with Klimt, because only one matches his compositional aesthetic. The source image has to align with the geometry of the basin you’re targeting. The group shot of my kids works with Arcane because it matches a hero pose the model recognizes from that universe. I have a family matriarchy group shot that works with Renaissance because the composition accidentally mirrors the staging of a classical scene. When the geometry of the photo and the geometry of the basin align, the model doesn’t have to fight anything. It just falls in, and everything lands. My kids at a Pokemon tournament clicked right into a Pokemon Center scene because the spatial layout was already there. No prompt engineering. Just recognition.
For the image above, I started with a casual photo of myself leaning against a wall in a plaid shirt and wrote: “Reimagine this image of me as a renaissance swashbuckler.” Same pose, same wall, same lean, same expression. Even the terracotta pots in the corner migrated. The model read the composition, recognized the long hair as a period-appropriate bridge, and mapped the entire scene into musketeer-era costuming without being told anything beyond the genre coordinate. One photo. One sentence. And everything landed because the geometry of the source aligned with the geometry of the basin.
Once you understand this, you can get precise, sometimes stunning results with almost no prompt at all. Not for everything. These aren’t universal tricks. But they work because they start from an understanding of what the model actually is: a landscape with topography. Hills and valleys. Regions of dense coverage and regions of sparse coverage. Gravitational pulls that will drag your output in directions you didn’t intend if you don’t know they’re there.
The five-hundred-word prompt treats the model like a disobedient employee who needs more detailed instructions. That’s a control posture, and it doesn’t work because you’re not dealing with a comprehension problem. You’re dealing with physics. The model that cooperates with you, the one that produces something neither of you could have specified alone, only shows up when you stop trying to constrain it into compliance and start collaborating with it. Learn where the basins are. Learn which ones to avoid and which ones to trust. That’s the literacy that matters, and while visual examples make for a compelling narrative, this approach works just as well with text, sound and other modalities.
The word “hallucination” lets people believe the failure modes are all the same problem, and that the problem is being solved. It isn’t. The easy version is being solved. The hard version is a property of the architecture. And the people who learn to see the architecture will get results that look like magic to the people who don’t.
Data has black holes too. And just like the ones in space, you can’t see them from the inside. But you can learn to read the starlight bending around them.