AI hallucination - Definition

An AI hallucination is any output a language model presents confidently as fact but that is false, fabricated, or unsupported by its sources. The model isn't lying - it's doing what it always does, predicting plausible text - but the result is a statement that reads as authoritative while having no grounding in reality.

Why it happens

Language models generate text by predicting likely next tokens, not by looking up facts. When the relevant information isn't in the model's context (the text it can actually see while answering), the model fills the gap with whatever is statistically plausible. That's how you get invented citations, made-up statistics, and confidently wrong summaries.
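The gap-filling behavior can be illustrated with a toy sketch. Nothing here is a real model: the journal names, the probabilities in PRIOR, and the functions next_token_probs and predict are all hypothetical, chosen only to show that a predictor always emits its most plausible continuation, whether or not the context contains the answer.

```python
# Toy illustration only: names and probabilities are invented, not taken from any real model.
JOURNALS = ["Nature", "Science", "Cell"]

# Hypothetical "memory" of which journal name tends to follow the prompt
# "The study was published in the journal ..."
PRIOR = {"Nature": 0.40, "Science": 0.35, "Cell": 0.25}


def next_token_probs(context):
    """Return a toy probability distribution over the next token."""
    mentioned = [j for j in JOURNALS if j in context]
    if mentioned:
        # Evidence in the context dominates the prediction.
        return {j: (1.0 / len(mentioned) if j in mentioned else 0.0) for j in JOURNALS}
    # No evidence: fall back to what is statistically plausible.
    return dict(PRIOR)


def predict(context):
    """Greedy decoding: always emit the highest-probability token."""
    probs = next_token_probs(context)
    return max(probs, key=probs.get)


guess = predict("")                                # nothing in context: a plausible guess
grounded = predict("The paper appeared in Cell.")  # answer is in context
print(guess, grounded)
```

Note that predict has no way to say "I don't know": with an empty context it still returns a journal name, and that output looks no different from the grounded one.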

How document AI prevents it

In document AI - reading and answering questions about PDFs, wikis, and the web - hallucination is largely solvable, because the source material exists and can be put in front of the model:

Retrieval-augmented generation retrieves the relevant passages first, so the model answers from real text rather than memory.
Source-grounding constrains the answer to those retrieved passages and attaches a citation to each claim.
Citation checking verifies, after the fact, that each claim is actually supported by a retrieved passage - and drops the ones that aren't.
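The three steps above can be sketched in a few lines. This is a minimal illustration under loud assumptions, not how any particular product implements them: retrieval and support checking both use crude word overlap, the 0.6 threshold is arbitrary, and every function name and example passage is hypothetical. A real system would likely use a stronger retriever and a stronger support check.

```python
import re


def tokens(text):
    """Lowercase word set, used for crude lexical overlap."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, passages, k=2):
    """Step 1 (retrieval): rank passage ids by words shared with the question."""
    q = tokens(question)
    ranked = sorted(
        range(len(passages)),
        key=lambda i: len(q & tokens(passages[i])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, passages, ids):
    """Step 2 (grounding): show only retrieved passages and ask for a citation per claim."""
    sources = "\n".join(f"[{i}] {passages[i]}" for i in ids)
    return (
        "Answer using only the sources below. Cite a source id after each claim.\n"
        f"{sources}\nQuestion: {question}"
    )


def is_supported(claim, passage, threshold=0.6):
    """Step 3 (checking): is enough of the claim's wording found in the cited passage?"""
    c = tokens(claim)
    return bool(c) and len(c & tokens(passage)) / len(c) >= threshold


def check_citations(claims, passages):
    """Keep only (claim, source_id) pairs whose cited passage supports the claim."""
    return [
        (claim, i)
        for claim, i in claims
        if 0 <= i < len(passages) and is_supported(claim, passages[i])
    ]


# Hypothetical document passages and model output, for illustration only.
passages = [
    "The warranty covers parts and labor for two years.",
    "Returns are accepted within 30 days of purchase.",
    "Shipping is free on orders over 50 dollars.",
]
question = "How long does the warranty last?"
top_ids = retrieve(question, passages)
prompt = build_prompt(question, passages, top_ids)

claims = [
    ("The warranty covers parts and labor for two years", 0),
    ("The warranty covers accidental water damage", 0),  # not in the source
]
kept = check_citations(claims, passages)
print(kept)
```

In this example the second claim cites a real passage but shares only half of its words with that passage, so the check drops it. Word overlap is a weak proxy for support, since a claim can reuse a passage's vocabulary and still contradict it, which is why the check is labeled a sketch.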

Sidenote uses all three: every answer is grounded in passages retrieved from the document you're reading, every claim carries a citation that scrolls to the source, and any claim that can't be matched to a passage has its citation dropped before you see it.

FAQ

Why do AI models hallucinate?

Because they generate text by predicting what plausibly comes next, not by looking facts up. When the true answer isn't in the model's context, the most statistically likely continuation is often a confident guess, and the model has no internal signal separating recall from invention.

Can AI hallucinations be prevented completely?

Not in open-ended generation, but in document Q&A they can be made rare and, more importantly, visible. Grounding answers in retrieved passages, citing each claim, and dropping claims no passage supports means an error has to survive three checks before it reaches you, and clicking the citation exposes it if it does.

How can I tell if an AI answer is hallucinated?

You can't tell from the prose; hallucinated claims read exactly like correct ones. The only reliable test is checking the claim against a source, which is why AI summaries need citation checking rather than trust.
