[openai-blog] Announcing the OpenAI Safety Fellowship

SEV-3 OpenAI

2026-05-10 2 sources standard

OpenAI announced a new Safety Fellowship programme on 6 April 2026, aimed at funding external researchers to investigate AI safety challenges [source]. The fellowship will provide grants and compute resources to academic and independent researchers working on alignment, interpretability, and adversarial robustness.

The programme represents a shift in OpenAI's approach to safety research, moving some evaluation work outside the company's internal teams. Fellows will receive access to model APIs and structured feedback channels, though the announcement does not specify whether they will gain access to model weights or training infrastructure.

OpenAI stated the fellowship will run for twelve months initially, with potential for extension based on outcomes. The company did not disclose the total budget or number of fellows to be selected. Applications open in May 2026, with the first cohort expected to begin work in July.

The announcement follows a period of public scrutiny over OpenAI's safety practices. In March 2026, former employees published an open letter questioning the company's internal safety protocols. OpenAI has not directly addressed those claims, but the fellowship announcement references "expanding the community of researchers who can independently assess our systems" [source].

The fellowship structure differs from OpenAI's existing red-teaming programmes, which focus on short-term adversarial testing. Fellows will conduct longer-term research projects, with findings expected to be published openly. OpenAI stated it will not require pre-publication review, though the company reserves the right to withhold access if researchers violate terms of service.

The programme does not address concerns raised by critics about access to frontier models during training or fine-tuning stages.

Why this is an AI incident

Launch-archive bulk classification (10 May 2026). The source signal originates from a real AI provider, regulator, or model-comparison probe; the harm or behavioural change described would not have occurred without the AI system being deployed in the role described. An editor reviewing the archive may amend the rationale for each wire.

Counterfactual "but-for" test per the Editor's Guide.

Codes M1, F10

Providers OpenAI