An AI hallucination is any output that a language model presents confidently as fact but that is false, fabricated, or unsupported by its sources. The model isn't lying. It's doing what it always does, predicting plausible text, but the result is a statement that reads as authoritative while having no grounding in reality.
Why it happens
Language models generate text by predicting the most likely next token, not by looking up facts. When the relevant information isn't in the model's context (the text it can actually see while answering), it fills the gap with whatever is statistically plausible. That's how you get invented citations, made-up statistics, and confidently wrong summaries.
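The gap-filling behavior described above can be illustrated with a toy model. This is only a sketch: real language models use neural networks over very large vocabularies, not the bigram counts used here, and the corpus and function names (`train_bigrams`, `predict_next`, `complete`) are made up for the example. It shows that a predictor will happily complete a sentence about something it has never seen, because it picks the most frequent continuation rather than looking anything up.

```python
from collections import Counter, defaultdict

# Tiny invented training text (illustrative only, not real data).
CORPUS = (
    "the study was published in 2019 . "
    "the study was published in 2019 . "
    "the report was published in 2021 ."
).split()


def train_bigrams(tokens):
    """Count how often each token follows each other token."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def predict_next(counts, prev):
    """Return the most frequent follower of `prev`, or None if `prev` was never seen."""
    followers = counts.get(prev)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


def complete(counts, prompt, max_tokens=5):
    """Extend the prompt one token at a time until a period or a dead end."""
    tokens = prompt.split()
    for _ in range(max_tokens):
        nxt = predict_next(counts, tokens[-1])
        if nxt is None or nxt == ".":
            break
        tokens.append(nxt)
    return " ".join(tokens)


BIGRAMS = train_bigrams(CORPUS)

# The corpus never mentions a memo, yet the model supplies a year anyway,
# because "2019" is the most common token after "in".
print(complete(BIGRAMS, "the memo was published in"))
```

The completion ends in a specific year even though nothing in the training text says when any memo was published. The answer is plausible, not grounded.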
How document AI prevents it
In document AI — reading and answering questions about PDFs, wikis, and the web — hallucination is largely solvable, because the source material exists and can be put in front of the model:
- Retrieval-augmented generation retrieves the relevant passages first, so the model answers from real text rather than memory.
- Source-grounding constrains the answer to those retrieved passages and attaches a citation to each claim.
- Citation checking verifies, after the fact, that each claim is actually supported by a retrieved passage — and drops the ones that aren't.
Sidenote uses all three: every answer is grounded in passages retrieved from the document you're reading, every claim carries a citation that scrolls to the source, and any claim that can't be matched to a passage has its citation dropped before you see it. The result isn't "trust me" — it's "here's the sentence I used."
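The three steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Sidenote's implementation: word overlap stands in for a real retriever and a real support check (production systems typically use embeddings or an entailment model), the 0.6 threshold is arbitrary, the model call itself is omitted, and every name here (`retrieve`, `build_grounded_prompt`, `check_citations`, the sample passages and claims) is hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, passages, top_k=2):
    """Step 1: rank passages by word overlap with the question (crude stand-in retriever)."""
    q = tokenize(question)
    ranked = sorted(
        enumerate(passages),
        key=lambda item: len(q & tokenize(item[1])),
        reverse=True,
    )
    return [idx for idx, text in ranked[:top_k] if q & tokenize(text)]


def build_grounded_prompt(question, passages, retrieved):
    """Step 2: constrain the model to the retrieved passages and ask for citations."""
    sources = "\n".join(f"[{i}] {passages[i]}" for i in retrieved)
    return (
        "Answer using only the numbered sources below and cite one after each claim.\n"
        f"{sources}\n\nQuestion: {question}"
    )


def support_score(claim, passage):
    """Fraction of the claim's words that also appear in the passage."""
    words = tokenize(claim)
    return len(words & tokenize(passage)) / len(words) if words else 0.0


def check_citations(claims, passages, retrieved, threshold=0.6):
    """Step 3: keep a citation only if some retrieved passage supports the claim."""
    checked = []
    for claim in claims:
        best = max(retrieved, key=lambda i: support_score(claim, passages[i]), default=None)
        if best is not None and support_score(claim, passages[best]) >= threshold:
            checked.append((claim, best))
        else:
            checked.append((claim, None))  # no supporting passage: citation dropped
    return checked


# Invented sample document and model output, for illustration only.
PASSAGES = [
    "The warranty covers parts and labor for two years from the date of purchase.",
    "Returns are accepted within 30 days with the original receipt.",
    "The device ships with a USB-C charging cable.",
]
QUESTION = "How long does the warranty cover parts?"
CLAIMS = [
    "The warranty covers parts and labor for two years.",
    "The warranty also covers accidental water damage.",
]

RETRIEVED = retrieve(QUESTION, PASSAGES)
PROMPT = build_grounded_prompt(QUESTION, PASSAGES, RETRIEVED)
CHECKED = check_citations(CLAIMS, PASSAGES, RETRIEVED)
for claim, source in CHECKED:
    print(source, claim)
```

In this sample, the first claim keeps its citation to passage 0, while the water-damage claim matches no retrieved passage well enough and ends up with no citation. A stricter variant could remove the unsupported claim entirely instead of only removing its citation.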