Why AI hallucination happens

Large language models generate text by predicting, token by token, what text is most statistically likely to follow a given prompt. They are trained on vast amounts of text and learn the statistical patterns of how language is used, how facts are typically stated, how arguments are structured, and what follows from what in a given domain. When they generate an answer to a question, they are producing text that is statistically plausible given their training — not text that is verified against a database of facts.

The critical insight is that these models have no built-in mechanism for distinguishing between things they "know" (patterns strongly represented in training data) and things they are "making up" (patterns extrapolated into territory not well represented in training data). A model trained on thousands of legal cases can generate legal text that reads like authentic case law, including fabricated case citations that follow the correct citation format, reference real courts, and use authentic legal language. The model is generating statistically plausible legal text; on its own, it cannot verify that the cases it cites actually exist.
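The mechanism described above can be illustrated with a toy sketch. It is not how a real language model is built: the lookup table `TOY_LOGITS`, its scores, and the party names are all invented for illustration, and a real model conditions on the whole preceding text rather than on the last token alone. What it does show is that token-by-token sampling can produce output shaped like a case citation without any step that checks whether such a case exists.

```python
import math
import random

# Hypothetical toy "model": for each context token, made-up raw scores
# (logits) for the possible next tokens. Nothing here is real data.
TOY_LOGITS = {
    "<start>": {"Smith": 2.0, "Jones": 1.5},
    "Smith": {"v.": 3.0},
    "Jones": {"v.": 3.0},
    "v.": {"Acme": 1.2, "Globex": 1.0},
    "Acme": {"(1987)": 1.0, "(2003)": 1.0},
    "Globex": {"(1987)": 1.0, "(2003)": 1.0},
}


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(s - peak) for tok, s in scores.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def generate(rng, start="<start>"):
    """Sample one token at a time until the toy model has no continuation."""
    tokens = []
    context = start
    while context in TOY_LOGITS:
        probs = softmax(TOY_LOGITS[context])
        choices = list(probs)
        weights = [probs[tok] for tok in choices]
        # The next token is drawn by likelihood alone; no fact is consulted.
        context = rng.choices(choices, weights=weights, k=1)[0]
        tokens.append(context)
    return " ".join(tokens)


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(3):
        print(generate(rng))
```

Every string this prints has the form of a citation (party, "v.", party, year), yet none of them was looked up anywhere. The format is correct because the format is what the scores encode.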

The governance framework for hallucination risk

Classification by consequence: Not all hallucination is equally consequential. An AI drafting a creative email that contains a minor factual error is different from an AI generating a regulatory submission that cites non-existent regulations. The governance response should be proportionate to the consequence of the AI content being wrong.

Verification protocols: For high-stakes AI content, define specific verification protocols. For legal AI: require that all case citations are verified against authoritative databases before use. For medical AI: require clinical review of all medical claims by qualified clinicians. For financial AI: require verification of all factual claims and data against primary sources. For regulatory AI: require verification of all regulatory references against the primary regulatory texts.

Retrieval-augmented generation: RAG architectures, where the AI is constrained to generate content based on verified source documents provided in its context, significantly reduce hallucination risk by anchoring outputs to specified sources. RAG is not hallucination-proof, since models can still misrepresent source content, but it reduces the rate of fabrication and makes verification easier.

Human review as a control: For high-stakes AI content, human review by a qualified expert who can detect hallucination is the most reliable control. This review must be genuine: a reviewer who cannot detect hallucination in the domain (because they lack domain expertise or because the review is too cursory) does not provide effective control.
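
As a rough illustration, classification by consequence, citation verification, and human review could be combined into a single release gate along the following lines. This is a minimal sketch under stated assumptions, not a reference implementation: the names `Consequence`, `Draft`, and `release_blockers` are hypothetical, the two-tier classification is a simplification, and the authoritative database is stood in for by a plain set of strings. The citation strings in the usage example are deliberate placeholders, not real cases.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Set


class Consequence(Enum):
    """Hypothetical two-tier classification; real schemes may be finer."""
    LOW = 1   # e.g. a creative email draft
    HIGH = 2  # e.g. a legal, medical, financial, or regulatory document


@dataclass
class Draft:
    text: str
    consequence: Consequence
    citations: List[str] = field(default_factory=list)
    reviewed_by_expert: bool = False  # set by a qualified human reviewer


def unverified_citations(draft: Draft, authoritative_index: Set[str]) -> List[str]:
    """Return the citations that are absent from the authoritative index."""
    return [c for c in draft.citations if c not in authoritative_index]


def release_blockers(draft: Draft, authoritative_index: Set[str]) -> List[str]:
    """List the reasons a draft may not be released; empty means it may."""
    # Assumption: low-consequence content passes without extra checks,
    # reflecting a response proportionate to the cost of being wrong.
    if draft.consequence is Consequence.LOW:
        return []
    blockers = [
        f"unverified citation: {c}"
        for c in unverified_citations(draft, authoritative_index)
    ]
    if not draft.reviewed_by_expert:
        blockers.append("missing qualified expert review")
    return blockers


if __name__ == "__main__":
    # Placeholder entries standing in for an authoritative database.
    index = {"Example v. Placeholder (verified entry)"}
    draft = Draft(
        text="...",
        consequence=Consequence.HIGH,
        citations=[
            "Example v. Placeholder (verified entry)",
            "Invented v. Nonexistent (fabricated entry)",
        ],
    )
    for reason in release_blockers(draft, index):
        print(reason)
```

The gate reports reasons rather than a bare yes or no so that a reviewer can see exactly which citation failed the lookup. It checks only that a citation exists in the index; whether the cited source actually supports the claim made still depends on the expert review.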