[openai-blog] Introducing Structured Outputs in the API

SEV-3OpenAI

[openai-blog] Introducing Structured Outputs in the API

2026-05-10 2 sources standard

OpenAI announced on 6 August 2024 that it had introduced Structured Outputs in its API, a feature designed to ensure model responses match developer-defined JSON schemas with guaranteed reliability [source]. The company stated that developers can now supply a JSON schema and receive outputs that strictly adhere to it, eliminating a class of errors where models previously returned malformed or non-conforming JSON.

According to the announcement, Structured Outputs are available via two mechanisms: a new parameter in the Chat Completions API called `response_format: { type: "json_schema" }`, and a new way to define function call schemas that ensures 100% reliability. OpenAI reported that in their evaluations, the feature achieved a perfect score on a complex schema adherence benchmark, compared to 40% for models using standard JSON mode.

The feature addresses a longstanding issue in production deployments where models would occasionally omit required fields, invent extra properties, or return syntactically invalid JSON despite instructions. OpenAI noted that developers previously resorted to retry logic, validation layers, and prompt engineering to mitigate these failures.

Structured Outputs are supported in GPT-4o, GPT-4o mini, and future models, with the company stating that the feature uses constrained decoding to enforce schema compliance. OpenAI also disclosed that refusals—cases where the model declines to fulfil a request—are now surfaced as a distinct `refusal` field rather than as malformed content.

The announcement did not specify whether the feature incurs additional latency or cost. Developers using earlier JSON mode implementations may need to migrate schemas to the new format to benefit from guaranteed adherence.

Why this is an AI incident

Launch-archive bulk classification (10 May 2026). Source signal originates from a real AI provider, regulator, or model-comparison probe; the harm or behavioural change described would not have occurred without the AI system being deployed in the role described. Editor reviewing the archive may amend the rationale per-wire.

Counterfactual "but-for" test per the Editor's Guide.

Codes M1, F10

Providers OpenAI