SEV-3 · OpenAI
2 sources · standard

On 14 September 2017, OpenAI published research on training neural networks to model the mental states of other agents, a capability known as theory of mind [source]. The research focused on teaching AI systems to predict what other agents believe, even when those beliefs differ from reality.

The team trained networks using a dataset of scenarios where agents observe partial information about their environment. The models learned to distinguish between what an agent knows and what actually exists in the world, demonstrating rudimentary perspective-taking abilities. Performance was measured against tasks requiring inference about hidden information and false beliefs.
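
The distinction between what an agent knows and what actually exists can be illustrated with a classic false-belief scenario. The Python sketch below is a toy illustration only and is not taken from the paper: the names Agent, move_object and has_false_belief are hypothetical, and a trained network would have to infer an agent's belief from observations rather than read it from a dictionary as this code does.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Hypothetical agent that only updates beliefs about events it witnesses."""
    name: str
    present: bool = True
    beliefs: dict = field(default_factory=dict)


def move_object(world, agents, obj, location):
    """Change the true world state; only agents who are present observe it."""
    world[obj] = location
    for agent in agents:
        if agent.present:
            agent.beliefs[obj] = location


def has_false_belief(agent, world, obj):
    """True when the agent's belief about obj differs from the real state."""
    return agent.beliefs.get(obj) != world.get(obj)


world = {}
sally = Agent("sally")
anne = Agent("anne")
agents = [sally, anne]

# Both agents see the marble placed in the basket.
move_object(world, agents, "marble", "basket")

# Sally leaves, so she does not observe the next move.
sally.present = False
move_object(world, agents, "marble", "box")

print("world:", world["marble"])
print("sally believes:", sally.beliefs["marble"])
print("anne believes:", anne.beliefs["marble"])
```

A false-belief task of the kind described above would ask a model where Sally will look for the marble. The correct answer follows her belief, not the true world state.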

The research represents early work in machine theory of mind, predating OpenAI's large language model deployments. The paper noted that while the networks showed promise on structured tasks, they operated in constrained environments with explicit agent representations. The authors acknowledged limitations in generalisation and the gap between these controlled experiments and real-world social reasoning.

This work preceded public releases of GPT-2 (2019) and GPT-3 (2020). The research direction explored whether neural architectures could learn social cognition primitives from data, rather than relying on hand-coded rules about mental states. The paper described the approach as a step toward AI systems that better understand human intentions and beliefs.

The publication appeared on OpenAI's blog as part of the organisation's research output during its transition from a pure research lab toward product development. No commercial applications were announced at the time. The work contributed to broader academic discussion about whether statistical learning methods could acquire capabilities traditionally associated with human social intelligence.

Why this is an AI incident

Launch-archive bulk classification (10 May 2026). The source signal originates from a real AI provider, regulator, or model-comparison probe; the harm or behavioural change described would not have occurred without the AI system being deployed in the role described. An editor reviewing the archive may amend the rationale per wire.

Counterfactual "but-for" test per the Editor's Guide.

Codes: M1, F10
Providers: OpenAI