SEV-3 · OpenAI
2 sources standard

OpenAI published an explainer on 5 September 2025 addressing why its language models produce hallucinations, acknowledging that the issue persists and affects outputs across its model family [source].

The post describes hallucinations as instances where models generate plausible-sounding but factually incorrect or unsupported information. OpenAI attributes the behaviour to several factors: models predict statistically likely continuations rather than retrieving verified facts, training data contains inaccuracies the models may reproduce, and the architecture lacks mechanisms to distinguish between memorised knowledge and probabilistic guessing.
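The first of those factors can be illustrated with a toy sketch. It is not taken from OpenAI's post: the prompt, the candidate tokens, and the probabilities below are all hypothetical, chosen only to show that picking the most likely continuation involves no check against a source of truth.

```python
import random

# Hypothetical next-token distribution for the prompt
# "The capital of Australia is". The numbers are invented for illustration.
NEXT_TOKEN_PROBS = {
    "Sydney": 0.55,     # frequent in text, but factually wrong
    "Canberra": 0.40,   # factually correct
    "Melbourne": 0.05,
}


def greedy_next_token(probs):
    """Return the single most probable token; no fact lookup is involved."""
    return max(probs, key=probs.get)


def sample_next_token(probs, rng):
    """Draw one token at random, weighted by its probability."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


if __name__ == "__main__":
    # Greedy decoding picks the likeliest token, which here is the wrong one.
    print(greedy_next_token(NEXT_TOKEN_PROBS))
    # Sampling can return any candidate, right or wrong.
    rng = random.Random(0)
    print([sample_next_token(NEXT_TOKEN_PROBS, rng) for _ in range(5)])
```

In this sketch the plausible but incorrect answer wins because it carries the most probability mass, not because anything verified it.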

OpenAI states it has implemented mitigation strategies including reinforcement learning from human feedback, instruction tuning to encourage citations, and retrieval-augmented generation in some products. The post notes these reduce but do not eliminate hallucinations, particularly in domains requiring specialised knowledge or recent information beyond training cutoffs.

The acknowledgement follows documented cases of GPT-4 and GPT-3.5 generating fabricated legal citations, non-existent academic references, and incorrect technical specifications in user interactions. OpenAI recommends users verify critical information independently and notes that model confidence scores do not reliably correlate with factual accuracy.

The post does not announce new technical solutions or provide timelines for further hallucination reduction. It frames the issue as an inherent limitation of current large language model architectures rather than a temporary engineering challenge.

Independent researchers have noted similar hallucination patterns across providers including Anthropic's Claude, Google's Gemini, and Meta's Llama models, suggesting the behaviour reflects fundamental characteristics of transformer-based architectures trained on internet-scale text corpora.

Why this is an AI incident

Launch-archive bulk classification (10 May 2026). The source signal originates from a real AI provider, regulator, or model-comparison probe; the harm or behavioural change described would not have occurred without the AI system being deployed in the role described. An editor reviewing the archive may amend the rationale per wire.

Counterfactual "but-for" test per the Editor's Guide.

Codes: M1, F10
Providers: OpenAI