[openai-blog] Advancing science and math with GPT-5.2
OpenAI announced GPT-5.2 on 11 December 2025, positioning the model as a specialist system for scientific and mathematical reasoning [source]. The provider states the model was trained on curated datasets from peer-reviewed journals, arXiv preprints, and mathematical competition problems, with reinforcement learning from domain experts.
According to the changelog, GPT-5.2 introduces a "multi-step verification mode" that generates intermediate proof steps and checks them against symbolic solvers before returning answers. OpenAI reports accuracy improvements on benchmarks including MATH (94.2%, up from 90.1% in GPT-4.5), GPQA Diamond (78.3%), and a subset of IMO problems (5 of 6 solved in testing).
The model is available via the API and to ChatGPT Pro subscribers. OpenAI states it supports a context window of up to 128,000 tokens and can process LaTeX, chemical structure files, and tabular data. The provider notes the model may refuse tasks outside its training distribution, returning a message that directs users to general-purpose models.
No independent benchmarking had been published at the time of the announcement. The provider did not release a technical paper, the composition of the training data, or details of the reinforcement learning protocol. OpenAI states the model was evaluated by researchers at undisclosed institutions prior to launch.
The announcement follows a pattern of iterative releases in OpenAI's GPT-5 series, which began in mid-2025. The provider has not disclosed whether GPT-5.2 shares architecture with earlier GPT-5 variants or represents a distinct training run. Pricing is set at $0.12 per 1,000 input tokens and $0.48 per 1,000 output tokens for API access.
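To put the quoted rates in concrete terms, the sketch below estimates the cost of a single API request. It assumes straight per-token proration of the announced prices, with no minimum charge, rounding rule, or volume discount, since the announcement specifies none of these. The function name `estimate_cost` and the sample token counts are illustrative and are not part of any OpenAI API.

```python
from decimal import Decimal

# Rates quoted in the announcement, in US dollars per 1,000 tokens.
INPUT_RATE_PER_1K = Decimal("0.12")
OUTPUT_RATE_PER_1K = Decimal("0.48")


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated cost in US dollars for one request.

    Assumption: cost scales linearly with token count. The announcement
    does not describe minimums, rounding, or discounts, so none are modelled.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = INPUT_RATE_PER_1K * input_tokens / 1000
    output_cost = OUTPUT_RATE_PER_1K * output_tokens / 1000
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical request: 10,000 input tokens and 2,000 output tokens.
    print(estimate_cost(10_000, 2_000))
```

Under these assumptions, a request with 10,000 input tokens and 2,000 output tokens comes to $1.20 + $0.96 = $2.16, and filling the full 128,000-token context with input alone would cost $15.36 before any output is generated. `Decimal` is used rather than `float` so that the currency arithmetic stays exact.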
Why this is an AI incident
Launch-archive bulk classification (10 May 2026). The source signal originates from a real AI provider, regulator, or model-comparison probe; the harm or behavioural change described would not have occurred without the AI system being deployed in the role described. An editor reviewing the archive may amend the rationale per wire.
Counterfactual "but-for" test per the Editor's Guide.