How LLMs choose their sources
Models do not cite at random, and they do not cite whoever ranks first. Understanding the two-stage logic behind a citation tells you exactly what to fix.
Two stages, two different bars
| Stage | What gets you through |
|---|---|
| Retrieval | Relevance, authority, crawlability, clean structure, schema |
| Synthesis | Specificity, verifiable claims, original data, consistency with the wider web |
Retrieval decides who is in the room. Synthesis decides who gets quoted. Most pages that lose are relevant enough to be retrieved but not trustworthy or specific enough to be cited. That gap is where the work is.
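The two bars can be pictured as two filters applied in sequence. The sketch below is purely illustrative: the `Page` type, the two scores, and both thresholds are hypothetical values invented for this example, and no real model exposes scores like these.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    # Hypothetical score standing in for relevance, authority,
    # crawlability, clean structure, and schema.
    retrieval_score: float
    # Hypothetical score standing in for specificity, verifiable claims,
    # original data, and consistency with the wider web.
    synthesis_score: float


def retrieved(pages, bar=0.5):
    """Stage 1: pages that clear the retrieval bar are 'in the room'."""
    return [p for p in pages if p.retrieval_score >= bar]


def cited(pages, retrieval_bar=0.5, synthesis_bar=0.7):
    """Stage 2: of the retrieved pages, only those clearing the synthesis bar get quoted."""
    return [p for p in retrieved(pages, retrieval_bar) if p.synthesis_score >= synthesis_bar]


# Made-up pages and scores, for illustration only.
pages = [
    Page("example.com/generic-guide", 0.8, 0.3),
    Page("example.com/original-benchmark", 0.7, 0.9),
    Page("example.com/facts-in-images", 0.2, 0.9),
]

in_the_room = [p.url for p in retrieved(pages)]
quoted = [p.url for p in cited(pages)]
print(in_the_room)
print(quoted)
```

In this toy data the generic guide is retrieved but never quoted, which is the gap described above. The page with strong content but poor retrievability never gets considered at all.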
What makes a model reach for you
You lower its uncertainty
A model writing an answer is constantly estimating how confident it can be. A specific number with a clear source, a first-hand test, or a named method each reduces that uncertainty, so those details get pulled in. Vague or generic claims do the opposite.
You agree with the world, or prove why you do not
Models are wary of claims that contradict the consensus without evidence. If you take a contrarian position, back it with data. If you simply contradict the record carelessly, you become a source to route around.
You are easy to read
Facts trapped in images, scripts, or sprawling unstructured prose are facts a model might miss or mangle. Clean structure and a Markdown version make your claims trivial to extract correctly. This is the foundation of generative engine optimization (GEO).
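A rough way to see what "trapped" means is to run a page through a plain HTML parser and look at what text survives. The sketch below uses Python's standard `html.parser` module. It assumes a simple reader that does not execute JavaScript or read text inside images, which may or may not match how any particular crawler behaves. The `VisibleText` class, the `extract_text` helper, and the sample markup are all made up for illustration.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collects text a simple parser can read, skipping <script> and <style> bodies."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag == "img":
            # An image contributes text only if it carries alt text.
            alt = dict(attrs).get("alt")
            if alt:
                self.parts.append(alt)

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def extract_text(html):
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


# Invented sample page: one fact in plain HTML, one inside an image
# with no alt text, one written to the page by a script.
sample = """
<h2>Load time</h2>
<p>Median load time fell from 2.1 s to 1.4 s.</p>
<img src="chart.png">
<script>document.write("Conversion rose 12%");</script>
"""

text = extract_text(sample)
print(text)
```

Only the claim written as plain HTML text comes through. The chart without alt text and the script-injected sentence contribute nothing to what this simple reader sees.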