[openai-blog] New and improved embedding model
OpenAI released a new embedding model on 15 December 2022, replacing its previous generation with a single unified model designated `text-embedding-ada-002`. The provider stated that the new model outperforms its prior embedding models on most text search, code search, and sentence similarity benchmarks while costing 99.8% less than the previous Davinci-based embedding model [source].
The update consolidated five previous models (text-similarity, text-search-query, text-search-doc, code-search-text, and code-search-code) into one endpoint. OpenAI reported that the new model achieves stronger performance on text search evaluations and comparable or improved results on code search tasks compared to its predecessors.
The provider disclosed that `text-embedding-ada-002` is not trained on customers' data sent via the API. The model produces embeddings normalized to length 1, meaning cosine similarity can be computed using a dot product, which the provider noted may simplify implementation for some users.
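The dot-product shortcut follows from the definition of cosine similarity: when both vectors have length 1, the denominator of the cosine formula is 1, so only the dot product remains. A minimal sketch in Python, using short made-up vectors rather than real model output; the helper names `dot`, `norm`, `normalize`, and `cosine_similarity` are illustrative and not part of any OpenAI library:

```python
import math

def dot(a, b):
    """Sum of element-wise products of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    """Euclidean length of a vector."""
    return math.sqrt(dot(a, a))

def normalize(a):
    """Scale a vector to length 1 (the wire says the model already does this)."""
    n = norm(a)
    return [x / n for x in a]

def cosine_similarity(a, b):
    """Full cosine formula: dot product divided by the product of the lengths."""
    return dot(a, b) / (norm(a) * norm(b))

# Hypothetical 2-dimensional "embeddings", for illustration only.
u = normalize([3.0, 4.0])
v = normalize([4.0, 3.0])

# For unit-length vectors the two quantities agree up to floating-point error.
print(math.isclose(cosine_similarity(u, v), dot(u, v)))
```

For vectors that are already unit length, skipping the two norm computations saves work per comparison, which may matter when scoring one query against many stored embeddings.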
Pricing for the new model was set at $0.0004 per 1,000 tokens, down from $0.200 per 1,000 tokens for the previous Davinci-based embedding model. OpenAI stated the model supports a maximum input length of 8,191 tokens.
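The two quoted prices can be checked against the 99.8% figure with simple arithmetic. The sketch below uses only the per-1,000-token prices stated above; the function name `embedding_cost` and the one-million-token workload are hypothetical, chosen for illustration.

```python
# Prices in USD per 1,000 tokens, as quoted in the wire.
NEW_PRICE_PER_1K = 0.0004  # text-embedding-ada-002
OLD_PRICE_PER_1K = 0.200   # previous Davinci-based embedding model

def embedding_cost(num_tokens, price_per_1k):
    """Cost in USD of embedding num_tokens at the given per-1,000-token price."""
    return num_tokens / 1000 * price_per_1k

# Relative price reduction between the old and new model.
reduction = 1 - NEW_PRICE_PER_1K / OLD_PRICE_PER_1K
print(f"Price reduction: {reduction:.1%}")

# Hypothetical workload: one million tokens.
tokens = 1_000_000
print(f"New model: ${embedding_cost(tokens, NEW_PRICE_PER_1K):.2f}")
print(f"Old model: ${embedding_cost(tokens, OLD_PRICE_PER_1K):.2f}")
```

Under these assumptions the reduction works out to 99.8%, matching the provider's stated figure, and the hypothetical workload costs $0.40 at the new price versus $200.00 at the old one.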
The provider recommended users migrate from older embedding models, noting that the legacy endpoints would remain available for a transition period. OpenAI published benchmark results showing the new model's performance across multiple evaluation datasets, including a reported 31.4% improvement on one text search benchmark and 20.9% improvement on a code search task compared to prior models.
The announcement did not specify the model's architecture, training data composition, or training completion date beyond the December release.
Why this is an AI incident
Launch-archive bulk classification (10 May 2026). Source signal originates from a real AI provider, regulator, or model-comparison probe; the harm or behavioural change described would not have occurred without the AI system being deployed in the role described. The editor reviewing the archive may amend the rationale per wire.
Counterfactual "but-for" test per the Editor's Guide.