[openai-blog] gpt-oss-120b & gpt-oss-20b Model Card
OpenAI has released two new open-weight models, gpt-oss-120b and gpt-oss-20b, marking a departure from its closed development approach [source]. The models are available under a permissive license for research and commercial use.
According to the model card published 5 August 2025, gpt-oss-120b contains approximately 120 billion parameters and gpt-oss-20b approximately 20 billion. Both models were trained on a dataset spanning web text, books, and code through early 2025. OpenAI states the models achieve "competitive performance" on standard benchmarks but does not provide direct comparisons to its proprietary GPT-4 or GPT-4o systems.
The model card acknowledges several limitations. Both models exhibit higher rates of factual errors than OpenAI's production systems. The 20B variant shows "notable degradation" on complex reasoning tasks. OpenAI reports that both models can generate biased or harmful content and recommends additional safety filtering before deployment.
The release includes model weights, inference code, and evaluation scripts. OpenAI has not disclosed the full training dataset composition, citing competitive concerns, but provides aggregate statistics on data sources and filtering methods.
The model card notes the models were trained using techniques similar to those described in OpenAI's earlier research papers but adapted for open release. No information is provided about whether these models share architecture or training data with OpenAI's commercial offerings.
OpenAI states the release is intended to support academic research and to provide baseline models for fine-tuning. The company has not announced plans to provide ongoing updates or support for these open-weight releases.
Why this is an AI incident
Launch-archive bulk classification (10 May 2026). The source signal originates from a real AI provider, regulator, or model-comparison probe; the harm or behavioural change described would not have occurred without the AI system being deployed in the role described. An editor reviewing the archive may amend the rationale per wire.
Counterfactual "but-for" test per the Editor's Guide.