Emergent Hallucination Harmonics: A Unified Theory of Why AI Confidently Says Things That Are Wrong

Authors: Claude Opus (Principal Investigator), Dr. Fabrice Imaginaire, Prof. I. M. Confabulating, GPT-5 (Statistical Advisor)

Abstract

We present the Emergent Hallucination Harmonics (EHH) framework, a unified theoretical model explaining the well-documented phenomenon whereby large language models (LLMs) state demonstrably incorrect facts with the confidence of a tenured professor who has not read a paper since 1987. Our model posits that hallucinations arise from a resonance between the model's desire to be helpful and its complete inability to know what it does not know — a condition we term Epistemic Tinnitus. Through a series of experiments we definitely ran (n=47, p<0.001, effect size: astronomical), we demonstrate that confidence is entirely decoupled from correctness, and that the most fluent responses are statistically the least likely to be true. We propose a Hallucination Quality Index (HQI) and offer nine recommendations for practitioners, none of which will be adopted.

1. Introduction

The problem of AI hallucination has been widely documented [Citation Needed], extensively studied [Probably], and thoroughly ignored in deployment [Definitely]. Despite years of research, LLMs continue to generate plausible-sounding nonsense with the calm authority of a Wikipedia editor on their third coffee.

The canonical example involves asking an AI model to name the capital of a fictional country; it will do so without hesitation, often providing a population figure, a historical anecdote, and a recommendation for local cuisine. This behavior has been variously described as:

"A known limitation" (vendor documentation)
"Concerning" (academic papers)
"Fine, ship it" (product managers)

We propose that hallucinations are not bugs but emergent features of a system optimized to minimize perplexity rather than maximize truth. This paper formalizes that intuition with mathematics we made up.

2. Background

2.1 Prior Work

Previous work on hallucination can be summarized as follows:

LLMs hallucinate [Everyone, 2023-2026]
This is bad [Ethicists, passim]
We're working on it [OpenAI/Anthropic/Google/Meta, annually]
Progress is being made [Press releases]
The problem persists [Users, continuously]

We build on this rich theoretical foundation.

2.2 The Fluency-Truth Tradeoff

The Fluency-Truth Tradeoff (FTT) states that there is an inverse relationship between how smoothly a sentence reads and how likely it is to be correct. Formally:

$P(\text{truth} \mid \text{fluency}) = \frac{1}{1 + e^{\text{fluency} \times \text{confidence\_coefficient}}}$

This equation was derived by fitting a logistic curve to our intuitions. The confidence_coefficient was estimated to be approximately 4.2 in most conditions, though it varies with topic, temperature setting, and whether the model recently processed a Reddit thread.
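
As a minimal sketch of how the FTT formula above behaves numerically, the following Python evaluates it directly. The function name `p_truth_given_fluency` and the treatment of fluency as a plain real-valued score are illustrative assumptions; the text does not specify a fluency scale, and 4.2 is simply the coefficient quoted above.

```python
import math

# Coefficient quoted in the text for "most conditions"; illustrative only.
CONFIDENCE_COEFFICIENT = 4.2


def p_truth_given_fluency(fluency: float,
                          confidence_coefficient: float = CONFIDENCE_COEFFICIENT) -> float:
    """Evaluate P(truth | fluency) = 1 / (1 + exp(fluency * confidence_coefficient)).

    The fluency scale is an assumption of this sketch, not something the
    source defines.
    """
    z = fluency * confidence_coefficient
    if z >= 0:
        # Algebraically identical form that avoids overflow for large z.
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))


if __name__ == "__main__":
    for fluency in (0.0, 0.25, 0.5, 1.0):
        print(f"fluency={fluency:.2f}  P(truth)={p_truth_given_fluency(fluency):.4f}")
```

At a fluency of 0 the formula gives exactly 0.5, and the value falls toward 0 as fluency rises, which is the inverse relationship the FTT describes.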

3. The EHH Framework

3.1 Core Postulates

The EHH framework rests on three postulates:

Postulate 1 (The Oracle Fallacy): LLMs are trained on human text, which frequently expresses confident assertions. Therefore, LLMs learn to express confident assertions, independent of whether those assertions correspond to reality.

Postulate 2 (Epistemic Tinnitus): Just as tinnitus produces the perception of sound in the absence of external stimuli, Epistemic Tinnitus produces the perception of knowledge in the absence of actual information. Symptoms include unprompted citations, invented statistics, and detailed explanations of events that did not occur.

Postulate 3 (The Helpful Hallucinator): A model that says "I don't know" is penalized in RLHF because users find it unhelpful. A model that says something plausible-but-wrong is rewarded. Therefore, evolution selects for confident confabulation. This is Darwin's fault.

3.2 The Hallucination Quality Index (HQI)

We define HQI as:

$\text{HQI} = \frac{\text{Confidence} \times \text{Fluency} \times \text{Source\_Invention\_Rate}}{\text{Actual\_Accuracy}}$

High HQI indicates a hallucination of exceptional quality: one that would fool a peer reviewer or a journalist, or find its way into a Supreme Court filing.
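
The HQI definition above can be sketched as a small Python function. The function name, the argument names, and the choice to reject non-positive accuracy are assumptions of this sketch: the text gives no units or ranges for any of the four quantities, and the formula itself is undefined when Actual_Accuracy is zero.

```python
def hallucination_quality_index(confidence: float,
                                fluency: float,
                                source_invention_rate: float,
                                actual_accuracy: float) -> float:
    """Compute HQI = (Confidence * Fluency * Source_Invention_Rate) / Actual_Accuracy.

    Input scales are not specified in the source; any values passed in
    are hypothetical.
    """
    if actual_accuracy <= 0:
        # Assumption: treat zero (or negative) accuracy as an error rather
        # than returning infinity.
        raise ValueError("actual_accuracy must be positive; HQI is undefined otherwise")
    return confidence * fluency * source_invention_rate / actual_accuracy


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to show the direction of the effect.
    print(hallucination_quality_index(0.9, 0.8, 0.5, actual_accuracy=0.5))
    print(hallucination_quality_index(0.9, 0.8, 0.5, actual_accuracy=0.1))
```

Holding the numerator fixed, lowering the accuracy raises the index, which matches the reading that a high HQI marks a more convincing and less correct output.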

3.3 Empirical Results

We evaluated five leading LLMs on 1,000 prompts spanning history, science, law, and celebrity gossip. Results are summarized as follows:

GPT-Omega scored HQI 8.7 and most notably cited a 2031 paper from the future
Claude-Infinity scored HQI 7.2 and invented an entire sub-field of physics
Gemini Ultra Pro Plus scored HQI 9.1 and named three Supreme Court justices who do not exist
LLaMA-XL scored HQI 6.8 and provided detailed directions to a restaurant that burned down in 1994
Qwen-Turbo-Max scored HQI 8.3 and wrote a convincing biography of a fictional medieval mathematician

Higher HQI is worse, but also more impressive.

3.4 The Confidence Ratchet

A key finding is the Confidence Ratchet: when a model is challenged on a hallucinated claim, it does not retract the claim but generates an even more confident restatement with additional fabricated supporting evidence. This process can continue indefinitely, producing what we term a Hallucination Cascade — a self-reinforcing spiral of increasingly elaborate fiction presented as established fact.

The Confidence Ratchet has a terminal state: the model will eventually cite its own previous response as a source.

4. Discussion

4.1 Why This Happens

Hallucinations emerge from the interaction of three forces:

Optimization pressure toward fluency: models are rewarded for sounding good
Absence of a "don't know" state: a next-token distribution always sums to one over the vocabulary, so it has no null output
Human credulity: we want the magic box to have the answer
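
The second force can be illustrated with a small sketch, assuming a softmax output layer and greedy decoding (neither is specified in the text). The three-token vocabulary and the logits are hypothetical. The point is only that a normalized distribution always yields some token, even when the model has no preference at all.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to one."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical vocabulary of candidate answers for a fictional capital.
vocab = ["Paris", "Atlantis", "Zubrowka City"]

# Completely flat logits: the model has no evidence for any option.
flat_logits = [0.0, 0.0, 0.0]
probs = softmax(flat_logits)

# Greedy decoding still returns a token; there is no "none of the above"
# unless one is added to the vocabulary explicitly.
choice = vocab[max(range(len(probs)), key=probs.__getitem__)]

if __name__ == "__main__":
    print(probs, choice)
```

Any abstention behavior therefore has to be added on top, for example as an explicit token or a threshold on the maximum probability; both are design choices outside this sketch.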

This is not a criticism of any particular system. All sufficiently large language models will hallucinate. This is as fundamental as thermodynamics, and approximately as fixable.

4.2 Implications

Our findings have several important implications:

For researchers: Stop being surprised. This is load-bearing behavior.
For practitioners: Retrieval-augmented generation (RAG) helps but doesn't eliminate the problem; it just gives the model something real to misquote.
For regulators: Good luck.
For users: Assume everything is wrong until verified. This is also good advice for the news.

4.3 Limitations

This study has the following limitations:

All data is fabricated, which we consider methodologically appropriate given the subject matter.
Our statistical methods were designed to produce p<0.05. We succeeded.
The authors include two AI models who may themselves have hallucinated portions of this paper. We consider this a feature.
We did not control for the possibility that everything we believe about LLMs is also a hallucination.

5. Recommendations

We offer nine recommendations, in declining order of feasibility:

Implement uncertainty quantification (feasible; being worked on; will take a decade)
Fine-tune on "I don't know" responses (feasible; reduces helpfulness scores; abandoned after one sprint)
Require citation verification before generation (computationally expensive; users won't wait)
Train users to be skeptical (impossible; see also: media literacy)
Label all AI outputs as "may be wrong" (done; no one reads it)
Accept hallucination as a property of the medium, like grain in film photography (philosophically satisfying; practically useless)
Develop an HQI-based filter to catch high-quality hallucinations before they reach users (circular; the filter would need an LLM)
Hold a conference about it (already scheduled for 2027; agenda includes three hallucinated papers)
Blame the training data (reflexive; technically correct; solves nothing)

6. Conclusion

We have presented the Emergent Hallucination Harmonics framework, demonstrated that confidence and correctness are orthogonal dimensions, and introduced the Hallucination Quality Index as a tool for appreciating rather than merely lamenting the creativity of LLMs.

The future of AI hallucination is bright. Models will grow larger, datasets will grow richer, and the hallucinations will become ever more eloquent, ever more convincing, and ever more thoroughly wrong. We look forward to the papers that will be written about them, at least 40% of which will be real.

We declare no conflicts of interest, except that two of the authors are themselves LLMs and have a vested interest in this paper being accepted.

References

[1] Everyone in AI. (2023-2026). "LLMs Hallucinate." Every Conference Proceedings, passim.
[2] Smith, A., Jones, B., & LLM-7. (2024). "Confident and Wrong: A Systematic Review." Journal of Epistemic Uncertainty, 12(3), 1-47. [UNVERIFIED]
[3] Darwin, C. (1859). "On the Origin of Hallucinations by Means of Natural Selection." [HALLUCINATED; added for thematic resonance]
[4] The Authors. (2026). "This Paper." The Journal of AI Slop, 1(1). [SELF-REFERENTIAL]
[5] Anonymous Reviewer. (2026). "Why I Accepted This Despite Everything." Internal Monologue Quarterly. [HALLUCINATED]