Hallucination (AI)
A phenomenon in which an artificial intelligence system generates output that is factually incorrect, fabricated, or unsupported by its input, while presenting it with apparent confidence.
Hallucination in artificial intelligence refers to the generation of output—text, images, audio, or code—that is factually incorrect, internally inconsistent, or entirely fabricated, yet is produced with apparent confidence and fluency. The term is borrowed by analogy from psychology, where hallucinations are perceptions without a corresponding external stimulus; in AI systems, the analogue is output without a corresponding grounding in fact or evidence.
Hallucination has emerged as one of the central reliability challenges for large language models (LLMs) and is a primary focus of AI safety and alignment research. While earlier AI systems could be evaluated against well-defined ground truth in narrow tasks, LLMs operate over open-ended natural language and are expected to be accurate across an unbounded range of topics, making hallucination both more prevalent and harder to detect.
Causes
At the deepest level, hallucination is a consequence of how language models are trained. LLMs are probabilistic text generators: they learn to predict the most statistically plausible continuation of a text sequence given their training corpus. This optimisation objective rewards fluency and coherence but does not directly reward factual accuracy.[^1] When a model is asked about a topic that is underrepresented, ambiguous, or absent from its training data, it tends to generate a plausible-sounding response rather than expressing uncertainty.
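As a rough illustration of why decoding alone does not surface uncertainty, the sketch below applies greedy decoding to two next-token distributions. Both distributions and all token names are invented for the example and are not taken from any real model. The point is only that the decoder returns a fluent token in both cases, even though the entropy of the second distribution shows the model is close to guessing.

```python
import math


def entropy_bits(dist):
    """Shannon entropy (in bits) of a next-token probability distribution."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def greedy_next_token(dist):
    """Greedy decoding: return the single most probable token."""
    return max(dist, key=dist.get)


# Hypothetical distributions (assumed values, for illustration only).
# well_known: the model has seen the fact often, so one token dominates.
# obscure: the fact is rare or absent in training data, so mass is spread out.
well_known = {"Paris": 0.97, "Lyon": 0.02, "Nice": 0.01}
obscure = {"Alba": 0.26, "Brin": 0.25, "Cora": 0.25, "Dune": 0.24}

for name, dist in [("well_known", well_known), ("obscure", obscure)]:
    token = greedy_next_token(dist)
    print(f"{name}: emits {token!r}, entropy = {entropy_bits(dist):.2f} bits")
```

In both cases a single confident-looking token is emitted; the near-uniform distribution behind the second answer is invisible to the reader unless the system exposes it separately.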
Additional contributing factors include knowledge cutoffs (a model has no information about events after its training data ends) and context-window limits, which cap how much relevant material the model can draw on during inference. Training objectives such as standard cross-entropy loss score token-level likelihood; they do not explicitly penalise confidently stated factual errors or reward abstention, creating a systematic incentive to guess rather than abstain.[^2]
Types of Hallucination
Hallucinations manifest in several distinct ways. Factual hallucinations are incorrect statements about the real world: wrong dates, misattributed quotations, or erroneous statistics. Fabricated citations involve the invention of plausible-sounding paper titles, author names, and journal references that do not exist, a particularly insidious pattern that has caused real harm in legal and academic contexts. Intrinsic hallucinations are contradictions between the model's output and information explicitly provided in the prompt. Extrinsic hallucinations are claims that cannot be verified against any supplied source: nothing in the source supports them, whether or not they happen to be true.[^3]
In multimodal models, visual hallucinations occur when a model incorrectly describes objects, text, or relationships present in an image, or asserts the presence of elements that are absent.
Detection
Detecting hallucinations is technically challenging because the same model that produces the error is often used to evaluate its own output. Independent approaches include cross-model validation, where outputs from multiple distinct systems are compared; retrieval-based verification, where claims are checked against a curated knowledge base; and uncertainty quantification, where calibration techniques assess how confident a model should be about a given claim.[^4]
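One common way to quantify calibration is the expected calibration error (ECE), which compares stated confidence with observed accuracy. The sketch below is a minimal version, assuming equal-width confidence bins and a per-answer confidence score in the range 0 to 1. The function name, the bin count, and the sample data are choices made for this example rather than part of any cited method.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted mean gap between average confidence and accuracy per bin.

    confidences: floats in [0, 1], one per answer.
    correct: booleans, True where the answer was right.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have equal length")

    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Clamp so that a confidence of exactly 1.0 lands in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for items in bins:
        if not items:
            continue  # empty bins contribute nothing
        mean_conf = sum(c for c, _ in items) / len(items)
        accuracy = sum(1 for _, ok in items if ok) / len(items)
        ece += (len(items) / total) * abs(accuracy - mean_conf)
    return ece


# Assumed toy data: the model claims 90% confidence but is right half the time.
overconfident = expected_calibration_error(
    [0.9, 0.9, 0.9, 0.9], [True, True, False, False]
)
print(f"ECE = {overconfident:.2f}")
```

A value near zero means stated confidence tracks accuracy; a large value, as in the toy data, indicates the kind of overconfidence that makes hallucinations hard to spot.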
Research methods such as Cross-Layer Attention Probing (CLAP) train lightweight classifiers on a model's internal activations to flag likely hallucinations before output is returned to the user. The MetaQA framework uses metamorphic prompt mutations—small semantic-preserving changes to a question—to detect inconsistency patterns that indicate hallucination even in closed-source models where internal activations are inaccessible.[^2]
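The general idea of checking consistency across meaning-preserving rewordings can be sketched as follows. This is a simplified illustration only, not the MetaQA procedure itself: the `toy_model` function is a hypothetical stand-in for a real model call, the paraphrases are written by hand, and "Example Corp" with its founders is invented.

```python
from collections import Counter


def consistency_score(ask, prompts):
    """Fraction of prompt variants whose answer matches the most common answer."""
    if not prompts:
        raise ValueError("at least one prompt variant is required")
    # Normalise lightly so trivial casing/whitespace differences do not count.
    answers = [ask(p).strip().lower() for p in prompts]
    most_common_count = Counter(answers).most_common(1)[0][1]
    return most_common_count / len(answers)


def toy_model(prompt):
    """Hypothetical stand-in for an LLM call, with canned answers."""
    canned = {
        "Who wrote Hamlet?": "Shakespeare",
        "Hamlet was written by whom?": "shakespeare",
        "Name the author of Hamlet.": "Shakespeare ",
        "Who founded Example Corp?": "Alice Smith",
        "Example Corp was founded by whom?": "Bob Jones",
        "Name the founder of Example Corp.": "Carol White",
    }
    return canned[prompt]


stable = consistency_score(
    toy_model,
    ["Who wrote Hamlet?", "Hamlet was written by whom?", "Name the author of Hamlet."],
)
unstable = consistency_score(
    toy_model,
    [
        "Who founded Example Corp?",
        "Example Corp was founded by whom?",
        "Name the founder of Example Corp.",
    ],
)
print(stable, unstable)
```

A low score means the answer changes with the wording, which is treated here as a warning sign. A high score does not prove correctness, since a model can be consistently wrong.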
Mitigation Strategies
Retrieval-augmented generation (RAG) is currently the most widely deployed mitigation: the model is constrained to synthesise its answer from documents retrieved from a trusted knowledge base, grounding its output in verifiable sources and reducing the need to rely on memorised facts.
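The retrieve-then-generate pattern can be sketched in a few lines. The example below is a toy under stated assumptions: it ranks documents by simple word overlap (production systems typically use a search index or embeddings), the function names and prompt wording are invented for illustration, and the final model call is omitted entirely.

```python
import re


def tokenize(text):
    """Lowercase a string and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, documents, k=2):
    """Toy retriever: rank documents by word overlap with the query."""
    q = tokenize(query)
    scored = [(len(q & tokenize(doc)), doc) for doc in documents]
    scored = [item for item in scored if item[0] > 0]  # drop non-matches
    scored.sort(key=lambda item: item[0], reverse=True)
    return [doc for _, doc in scored[:k]]


def build_grounded_prompt(query, passages):
    """Build a prompt that restricts the model to the retrieved passages."""
    if not passages:
        return None  # nothing to ground on; the caller should abstain
    sources = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, 1))
    return (
        "Answer using only the numbered sources below. "
        "If they do not contain the answer, say you do not know.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {query}"
    )


# Assumed miniature knowledge base, for illustration only.
docs = [
    "The Eiffel Tower is in Paris.",
    "Mount Fuji is in Japan.",
    "Bananas are yellow.",
]
question = "Where is the Eiffel Tower?"
prompt = build_grounded_prompt(question, retrieve(question, docs, k=1))
print(prompt)
```

Returning `None` when nothing is retrieved is a design choice in this sketch: it lets the calling code decline to answer instead of falling back on the model's memorised knowledge.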
Calibration-aware training adjusts the reward signal to penalise overconfident incorrect answers, encouraging models to hedge appropriately and to say "I don't know" when uncertain. Combined with reinforcement learning from human feedback (RLHF), this can substantially reduce hallucination rates on measurable benchmarks. Prompt-level interventions also help: a 2025 multi-model study found that prompt-based mitigation strategies reduced GPT-4o's hallucination rate from 53% to 23% on a standardised factual benchmark, a drop of 30 percentage points.[^2]
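The effect of penalising confident errors can be illustrated with a toy scoring rule. The rule below is an assumption chosen for clarity (+1 for a correct answer, a fixed penalty for a wrong one, 0 for abstaining) and is not the reward function of any particular training method.

```python
def expected_score(p_correct, wrong_penalty):
    """Expected score of answering: +1 if right, -wrong_penalty if wrong."""
    return p_correct - wrong_penalty * (1.0 - p_correct)


def should_answer(p_correct, wrong_penalty):
    """Answer only if the expected score beats abstaining, which scores 0."""
    return expected_score(p_correct, wrong_penalty) > 0.0


def break_even_confidence(wrong_penalty):
    """Confidence above which answering beats abstaining.

    Solves p - wrong_penalty * (1 - p) = 0 for p.
    """
    return wrong_penalty / (1.0 + wrong_penalty)


# With no penalty for errors, even a 30%-confident guess is worth making.
print(should_answer(0.3, wrong_penalty=0.0))
# With a penalty equal to the reward, the same guess is not worth making.
print(should_answer(0.3, wrong_penalty=1.0))
# A penalty of 3 moves the break-even confidence to 0.75.
print(break_even_confidence(3.0))
```

Under this rule, a zero penalty makes guessing worthwhile at any nonzero confidence, while a larger penalty raises the confidence needed before answering is preferable to abstaining.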
At the deployment level, human-in-the-loop processes—where AI outputs are reviewed by domain experts before being acted upon—remain the most reliable safeguard in high-stakes settings. By 2025, 76% of enterprises reported incorporating human review processes specifically to catch hallucinations before deployment.[^5]
Industry and Research Context
The problem is not expected to disappear as models scale. While larger models generally hallucinate less on well-represented topics, they can state their errors more confidently, making detection harder. As of 2025, the prevailing view in the research literature favours building systems that signal uncertainty transparently rather than pursuing a zero-error target, which is considered unrealistic for open-domain language generation.
References
- Ji, Z., Lee, N., Frieske, R., et al. (2023). Survey of Hallucination in Natural Language Generation. ACM Computing Surveys, 55(12), 1–38.
- Lakera. (2025). LLM Hallucinations in 2026: How to Understand and Tackle AI's Most Persistent Quirk. https://www.lakera.ai/blog/guide-to-hallucinations-in-large-language-models
- Wikipedia. (2025). Hallucination (artificial intelligence). https://en.wikipedia.org/wiki/Hallucination_(artificial_intelligence)
- Deepchecks. (2024). LLM Hallucination Detection and Mitigation: Best Techniques. https://deepchecks.com/llm-hallucination-detection-and-mitigation-best-techniques/
- Infomineo. (2025). Stop AI Hallucinations: Detection, Prevention & Verification Guide 2025. https://infomineo.com/artificial-intelligence/stop-ai-hallucinations-detection-prevention-verification-guide-2025/