SEV-3 · OpenAI
2 sources standard

OpenAI published a blog post on 16 February 2023 addressing questions of AI system behaviour and governance [source]. The post outlined the company's approach to determining default behaviours for ChatGPT and mechanisms for user customisation.

The company acknowledged that ChatGPT's responses reflect choices made during training and fine-tuning. OpenAI stated it uses reinforcement learning from human feedback (RLHF) to shape model outputs, with contractors following guidelines that define acceptable behaviour. The post noted these guidelines are not perfect and that the company is seeking external input on defaults.

OpenAI described three areas of focus: improving default behaviour to reduce biases and controversial outputs, allowing users to customise AI behaviour within broad bounds, and establishing public input mechanisms for system-wide rules. The company announced plans to solicit public comment on AI behaviour and to explore partnerships with external organisations for third-party audits.

The post stated that OpenAI does not want to be the sole arbiter of permissible content and behaviour. It proposed that decisions about AI system boundaries should involve a broader set of perspectives, particularly for systems deployed at scale.

No specific technical changes to ChatGPT were announced. The post served as a policy statement on governance rather than a changelog of model modifications. OpenAI indicated it would share more details on public input processes in subsequent communications.

The statement came amid ongoing public discussion about ChatGPT's refusals, political biases, and content filtering decisions following the model's November 2022 release.

Why this is an AI incident

Launch-archive bulk classification (10 May 2026). The source signal originates from a real AI provider, regulator, or model-comparison probe; the harm or behavioural change described would not have occurred without the AI system being deployed in the role described. An editor reviewing the archive may amend the rationale per wire.

Counterfactual "but-for" test per the Editor's Guide.

Codes M1, F10
Providers OpenAI