[openai-blog] Our First Proof submissions

SEV-3 OpenAI

2026-05-10 2 sources standard

OpenAI announced on 20 February 2026 that it has begun accepting proof submissions from users, marking a shift in how the company handles claims about model behaviour [source]. The initiative follows sustained criticism that OpenAI has historically dismissed user reports of output drift, hallucinations, and capability regressions without providing reproducible evidence or transparent investigation processes.

Under the new system, users can submit structured documentation of alleged model failures, including prompt logs, timestamps, and comparative outputs across model versions. OpenAI stated it will review submissions and publish findings for cases it deems "substantiated and reproducible." The company did not specify response timelines, review criteria, or whether submissions will be handled by automated systems or human reviewers.
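The submission contents described above (prompt log, timestamp, and comparative outputs across model versions) could be represented roughly as follows. This is a minimal sketch only: the source does not describe the portal's schema, so the class name `ProofSubmission`, its fields, and the `versions_disagree` check are hypothetical and are not OpenAI's format.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ProofSubmission:
    """Hypothetical record of one alleged model failure (not OpenAI's schema)."""

    prompt: str
    submitted_at: datetime
    # Maps a model version label to the output observed for the same prompt.
    outputs_by_version: dict[str, str] = field(default_factory=dict)

    def versions_disagree(self) -> bool:
        """Return True if at least two recorded versions gave different outputs."""
        return len(set(self.outputs_by_version.values())) > 1


# Illustrative data only; the version labels and outputs are made up.
example = ProofSubmission(
    prompt="Reverse the string 'abc'.",
    submitted_at=datetime(2026, 2, 21, tzinfo=timezone.utc),
    outputs_by_version={"model-v1": "cba", "model-v2": "abc"},
)
```

A difference between outputs only shows that two versions behaved differently on one prompt. It does not by itself meet the "substantiated and reproducible" bar, whose criteria the company has not specified.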

The announcement comes amid growing regulatory scrutiny of AI providers' accountability mechanisms. In January 2026, the European Union's AI Office issued guidance requiring providers to maintain auditable records of model behaviour changes. OpenAI's proof submission system appears designed to address compliance requirements while managing the volume of user-reported incidents.

Critics noted that the initiative places the burden of proof on users rather than requiring OpenAI to proactively monitor and disclose model changes. The company has not committed to publishing negative findings or rejected submissions, which leaves it unclear how much of the review process will be visible to the public. OpenAI previously faced backlash in 2024 when users reported widespread degradation in GPT-4's coding capabilities, claims the company initially characterised as "confirmation bias" before later acknowledging that optimisation changes had affected performance.

The proof submission portal is accessible via OpenAI's developer platform. The company stated it will publish aggregated statistics on submission volume and resolution rates quarterly.

Why this is an AI incident

Launch-archive bulk classification (10 May 2026). The source signal originates from a real AI provider, regulator, or model-comparison probe; the harm or behavioural change described would not have occurred without the AI system being deployed in the role described. An editor reviewing the archive may amend the rationale per wire.

Counterfactual "but-for" test per the Editor's Guide.

Codes: M1, F10

Providers: OpenAI