OpenAI and Anthropic, two of the biggest rivals in artificial intelligence, recently evaluated each other's models in an effort to better understand issues that their own tests may have missed.

In posts on both companies' blogs on Wednesday, OpenAI and Anthropic said that over the summer they ran safety evaluations on the other company's publicly available AI models. They also tested for any propensity to make up facts, as well as for misalignment, a term commonly used to describe an AI model that does not do what the people building it want it to do.

The companies are high-profile competitors — Anthropic was founded by former OpenAI employees — making the collaboration notable. OpenAI called the joint safety effort the “first major cross-lab exercise in safety and alignment testing,” adding that the group hoped it would provide a “valuable path to evaluate safety at an industry level.”

AI companies are under increasing pressure to focus on the safety of their products following a string of reports of harmful behaviour linked to heavy use of the models. Most recently, a lawsuit filed against OpenAI earlier this week alleged that a teenager died by suicide after using the chatbot as a coach.

The companies carried out the evaluations before OpenAI released its new flagship AI model, GPT-5, and before Anthropic rolled out Opus 4.1, the latest update to its Claude Opus model, in early August.

Published on August 28, 2025
