Anthropic investigates unauthorized access to Mythos AI model
Anthropic is investigating reports that its restricted Mythos AI model, a system designed for advanced cybersecurity work and considered too risky for public release, was accessed by a small group of unauthorized users through a third-party vendor environment. Bloomberg first reported the breach, while Anthropic has said it is reviewing the incident and currently has no evidence that its core internal systems were affected.
Mythos, formally referred to as Claude Mythos Preview, was introduced on April 7 as part of Project Glasswing, Anthropic’s initiative to help secure critical software using a highly capable model. Anthropic has described the system as able to identify and exploit vulnerabilities across major operating systems and web browsers when instructed by a user, which is why access has been limited to a small group of major technology and security partners rather than opened to the public. Those partners include companies such as Amazon Web Services, Apple, Google, Microsoft and Nvidia.
According to the reporting, the unauthorized access began on April 7, the same day Anthropic publicly launched the model for limited testing. Bloomberg reported that the access was allegedly obtained by members of a private online group focused on unreleased AI systems, with help from a third-party contractor and publicly available investigative techniques. The same report said the group used knowledge gleaned from Anthropic-related model formats exposed in a recent Mercor data breach to make an educated guess about where the model could be found online.
The incident is likely to sharpen scrutiny around how frontier AI models are secured, especially when their most sensitive capabilities are made available through outside vendor environments. Anthropic has positioned Mythos as a model with unusually strong cyber capabilities and has explicitly said it does not currently plan a public release because of the risk that the system could be weaponized.
Bloomberg’s report indicates the group had access to the model for roughly two weeks and claimed to have used it regularly, though not for overt cybersecurity tasks, apparently in an effort to avoid detection. Screenshots and a live demonstration were reportedly shown as evidence. The report also said other unreleased Anthropic models may have been accessed by the same group. Anthropic has not publicly confirmed those broader claims.
The episode underscores a growing tension in the AI industry. Companies are racing to develop systems powerful enough to defend critical infrastructure and software, while at the same time confronting the reality that those same tools could become dangerous if accessed by the wrong people. For Anthropic, a company that has made safety a central part of its public identity, the Mythos incident is likely to become a closely watched test of how securely frontier models can be contained once they move beyond a company’s own walls.
Photo Credit: DepositPhotos.com
