GPT-5.5 matches Anthropic’s Mythos in UK cyber evaluations, raising wider questions about AI security risk
OpenAI’s newly released GPT-5.5 has matched Anthropic’s closely guarded Mythos Preview model in advanced cybersecurity testing, according to new results from the UK’s AI Security Institute, suggesting that the cyber capabilities drawing concern around frontier AI may not be limited to one company or one model.
The findings come after Anthropic restricted the release of Mythos Preview to critical industry partners, citing its powerful cybersecurity capabilities. But AISI’s latest assessment found that GPT-5.5 achieved “a similar level of performance” on its cyber evaluations, indicating that the trend may reflect broader improvements in frontier AI reasoning, coding and long-horizon autonomy rather than a unique breakthrough by Anthropic.
Since 2023, AISI has tested leading AI models against 95 Capture the Flag challenges, which are designed to assess cybersecurity capabilities including reverse engineering, web exploitation and cryptography. On the institute’s highest-level “Expert” tasks, GPT-5.5 passed an average of 71.4 per cent, slightly ahead of Mythos Preview’s 68.6 per cent, although AISI said the results were within the margin of error.
In one notably difficult challenge, which required building a disassembler to decode a Rust binary, GPT-5.5 solved the task in 10 minutes and 22 seconds without human assistance, at a reported API cost of US$1.73. The result is likely to intensify debate about how quickly advanced AI systems are becoming useful for complex cyber tasks that previously required specialist human expertise.
GPT-5.5 also performed strongly on “The Last Ones”, an AISI test range simulating a 32-step data extraction attack on a corporate network. It succeeded in three of 10 attempts, compared with two of 10 for Mythos Preview. Before Mythos, no tested model had completed the simulation even once.
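With only 10 attempts per model, a one-run gap between the two systems says little on its own. The sketch below is an illustration, not AISI's method: the article does not say how the institute calculates its margin of error. It assumes each attempt is an independent pass/fail trial and uses the standard Wilson score interval at a 95 per cent level. The function names are hypothetical.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial success rate.

    Assumes independent pass/fail attempts; z = 1.96 gives roughly
    a 95 per cent interval.
    """
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need trials > 0 and 0 <= successes <= trials")
    p = successes / trials
    denom = 1 + z**2 / trials
    centre = (p + z**2 / (2 * trials)) / denom
    half_width = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return centre - half_width, centre + half_width


def intervals_overlap(a: tuple[float, float], b: tuple[float, float]) -> bool:
    """True when two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Attempt counts reported for "The Last Ones" in the article.
gpt = wilson_interval(3, 10)
mythos = wilson_interval(2, 10)

print(f"GPT-5.5:        {gpt[0]:.3f} to {gpt[1]:.3f}")
print(f"Mythos Preview: {mythos[0]:.3f} to {mythos[1]:.3f}")
print("Intervals overlap:", intervals_overlap(gpt, mythos))
```

Under these assumptions the two intervals are wide and overlap heavily, so the attempt counts alone do not show that one model is more reliable than the other on this range.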
The model did not succeed on AISI’s more difficult “Cooling Tower” test, which simulates an attempted disruption of control software for a power plant. AISI noted that every other model it has tested has also failed that scenario, indicating that frontier systems still fall short on some of the most complex cyber-physical tasks.
The results complicate the narrative around Mythos Preview, which Anthropic has positioned as a model with unusually serious cybersecurity implications. AISI said GPT-5.5’s performance suggests the risk may be “a byproduct of more general improvements in long-horizon autonomy, reasoning, and coding,” rather than a breakthrough specific to Mythos.
The issue has already become a point of tension between major AI companies. In a recent interview, OpenAI chief executive Sam Altman criticised what he described as “fear-based marketing” around restricted AI model releases, arguing that companies have an incentive to portray models as uniquely dangerous while selling access to protective tools or controlled environments.
OpenAI has also moved toward controlled access for high-capability cyber tools. In February, the company introduced its Trusted Access for Cyber program, which allows verified security researchers and enterprises to use frontier AI capabilities for legitimate defensive work. OpenAI says the program is intended to accelerate defensive research while recognising that the same capabilities could be misused by malicious actors.
Last month, OpenAI expanded the program and introduced GPT-5.4-Cyber, a model variant fine-tuned for additional cybersecurity capabilities and made available through higher levels of trusted access. OpenAI said those tiers were designed for verified cyber defenders and organisations responsible for protecting critical software and infrastructure.
Altman said this week that GPT-5.5-Cyber would also initially be limited to “critical cyber defenders”, reflecting the growing industry consensus that advanced cyber-capable models may need stricter deployment controls than general-purpose consumer systems.
The broader concern is that AI systems capable of helping defenders identify vulnerabilities, test infrastructure and prioritise patches could also help attackers discover weaknesses, automate reconnaissance or accelerate intrusion attempts. AISI’s findings suggest that those capabilities are emerging across the frontier model landscape, not just in specialised systems marketed for cybersecurity.
For policymakers, researchers and technology companies, the results raise a difficult question: how to give legitimate defenders access to increasingly powerful tools without also increasing the scale and sophistication of cyber threats.
The answer is likely to shape the next phase of AI regulation and deployment. As models become better at coding, reasoning and planning across long task chains, cybersecurity may become one of the first areas where frontier AI’s dual-use nature becomes impossible to ignore.
In a landscape where AI is rapidly changing both the tools available to defenders and the tactics available to attackers, knowledge is one of the strongest forms of protection. Whether you are a business owner, employee, developer, or everyday internet user, understanding the basics of cybersecurity can help you recognise risks earlier, make safer decisions, and respond with greater confidence.
