The Hours That Changed Cybersecurity: What Anthropic’s Mythos Test Means for Every Organisation

June 24, 2026 seanhackacademy

Viral claims that an artificial intelligence model hacked the NSA distorted what was an authorised security exercise. The verified account, however, may be even more consequential. AI is rapidly compressing the time needed to discover weaknesses, giving defenders a shrinking window in which to act.

The most alarming part of the Anthropic Mythos story is not the sensational claim that an AI system hacked one of the world’s most powerful intelligence agencies.

That did not happen.

The real warning is that an advanced AI model reportedly identified vulnerabilities across highly sensitive US government systems within hours during a controlled security assessment. Work that might once have required teams of experienced specialists, extensive preparation and days of investigation was apparently accelerated into a far shorter period.

Senator Mark Warner said he had been briefed by General Joshua Rudd, the head of the National Security Agency and US Cyber Command, about the model’s performance. Warner described Mythos as gaining access to almost all the classified systems included in the exercise within hours rather than weeks. A US official subsequently clarified that the model identified vulnerabilities quickly, but that this did not necessarily mean it successfully exploited every weakness within the same period. The NSA and Anthropic have not publicly disclosed the technical results.

That distinction matters. Discovering a vulnerability is not the same as completing an intrusion, stealing information, maintaining persistent access or causing operational damage. There is no public evidence that Mythos conducted an unauthorised attack or escaped from a controlled environment.

The exercise was reportedly carried out as part of Project Glasswing, Anthropic’s programme for placing advanced cyber capabilities in the hands of vetted defenders and organisations responsible for critical software.

Yet dismissing the story as nothing more than an exaggerated social media rumour would be equally dangerous.

The test offers a glimpse of a cybersecurity environment in which AI can inspect systems, analyse code, identify misconfigurations and assemble potential attack paths at a speed that human teams may struggle to match.

What actually happened

The story accelerated after The Economist reported Warner’s description of the government testing. Social media posts soon transformed a complicated red-team exercise into a much simpler and more dramatic narrative: that Anthropic’s AI had independently broken into the NSA.

The journalist behind the original report later cautioned that Warner’s description should not be interpreted literally. Mythos was apparently used alongside other tools, with access and instructions provided under specific testing conditions. It was not an uncontrolled system choosing to launch a real attack against the US intelligence community.

Several essential details remain unknown.

There is no public list of the systems involved, the vulnerabilities discovered, the permissions given to the model or the tools available during the assessment. It is also unclear whether the reference to almost all classified systems covered a limited set of test targets or a much larger government environment.

Without that information, sweeping conclusions about Mythos defeating the NSA are not justified.

But controlled conditions do not make the results meaningless. Red-team assessments are designed to test what a capable adversary might achieve while allowing weaknesses to be identified before a genuine attacker finds them. The relevant question is not whether Mythos committed a real breach. It is what its performance reveals about the future capabilities available to attackers and defenders.

From assistant to autonomous operator

Most public discussion of generative AI has focused on chatbots that write emails, produce images, summarise documents or generate software code. Mythos represents something more consequential: an AI system capable of carrying out extended sequences of technical actions.

Britain’s AI Security Institute reported that Claude Mythos Preview could execute multistage attacks against vulnerable test networks when it was explicitly directed and given network access. On expert-level capture-the-flag challenges, it succeeded 73 per cent of the time.

In a separate 32-step simulated corporate network attack, Mythos became the first model tested by the institute to complete the entire exercise. It succeeded in three of ten attempts and completed an average of 22 steps across all runs. The institute estimated that a human professional would require about 20 hours to finish the full simulation.
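Those figures are easier to weigh as proportions. The short sketch below uses only the numbers reported above (32 steps, ten attempts, three full completions, an average of 22 steps per run); the variable names are illustrative and not taken from the institute's report.

```python
# Figures as reported for the 32-step simulation; names are illustrative.
TOTAL_STEPS = 32
ATTEMPTS = 10
FULL_COMPLETIONS = 3
AVERAGE_STEPS_COMPLETED = 22

# Share of runs that finished every step of the exercise.
full_completion_rate = FULL_COMPLETIONS / ATTEMPTS

# How far through the exercise a typical run progressed.
average_progress = AVERAGE_STEPS_COMPLETED / TOTAL_STEPS

print(f"Full completion rate: {full_completion_rate:.0%}")      # 30%
print(f"Average progress per run: {average_progress:.0%}")      # 69%
```

In other words, most runs stopped short of the final objective, but the typical run still progressed about two thirds of the way through the exercise.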

These results do not mean an AI system can effortlessly compromise any network. The model was working inside an artificial environment, with a defined objective and access to the necessary tools. Real networks are inconsistent, noisy and frequently protected by controls that do not appear in laboratory exercises.

Nevertheless, the direction of progress is difficult to ignore.

AI systems are moving beyond offering advice to performing reconnaissance, reviewing source code, identifying weaknesses, attempting exploitation and adapting when one method fails. They can preserve context across longer tasks and repeat technical processes without fatigue.

The importance of this development lies in scale. A highly skilled penetration tester can examine only a limited number of systems at any one time. An AI-enabled security operation could potentially repeat similar assessments across numerous applications and environments, helping defenders find weaknesses more quickly.

The same capability could help an attacker do the same thing.

Project Glasswing and the race to find vulnerabilities first

Anthropic introduced Project Glasswing in April 2026 after concluding that Mythos-class models had reached a significant threshold in their ability to discover and exploit software vulnerabilities.

The company initially gave access to a limited group of security teams, technology companies and critical infrastructure providers. Anthropic later expanded the programme to approximately 150 organisations across more than 15 countries.

According to Anthropic, Glasswing participants used Mythos Preview to uncover thousands of high-severity flaws in important software. The company’s stated objective was to give defenders an opportunity to find and repair vulnerabilities before comparable AI capabilities became more widely available.

That approach reflects a fundamental change in vulnerability management.

For years, organisations have relied on a familiar cycle. A researcher discovers a flaw, the vendor investigates it, a patch is developed, customers are notified and administrators deploy the update. Attackers may then reverse-engineer the patch or exploit organisations that respond slowly.

Advanced AI threatens to compress every stage of that cycle.

A model capable of analysing millions of lines of code can search for previously undiscovered flaws. Once an update is published, AI can compare the old and new versions to determine what changed. It may then help create exploit code or identify organisations that remain vulnerable.
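Comparing two versions of a file is not exotic; it can be illustrated with Python's standard library. The sketch below is a minimal example, assuming two small, made-up source snippets, and uses difflib.ndiff to list the lines a patch added or removed. The function name changed_lines and the sample code are hypothetical, and real patch analysis usually works on compiled binaries and is far more involved.

```python
import difflib


def changed_lines(old_source: str, new_source: str) -> list[str]:
    """Return the lines added ('+ ') or removed ('- ') between two versions."""
    diff = difflib.ndiff(old_source.splitlines(), new_source.splitlines())
    # ndiff also emits unchanged lines ('  ') and hint lines ('? '); drop those.
    return [line for line in diff if line.startswith(("+ ", "- "))]


# Hypothetical "before" and "after" versions of a patched function.
before_patch = "def read(buf, n):\n    return buf[:n]\n"
after_patch = (
    "def read(buf, n):\n"
    "    if n > len(buf):\n"
    "        raise ValueError('n too large')\n"
    "    return buf[:n]\n"
)

for line in changed_lines(before_patch, after_patch):
    print(line)
```

Here the added bounds check points directly at the flaw the patch was written to close, which is why a published update can act as a signpost for attackers and defenders alike.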

This does not eliminate the need for human expertise. Findings must still be validated, prioritised and safely disclosed. False positives can consume valuable time, while poorly managed disclosure could expose weaknesses before fixes are ready.

What changes is the pace. Security teams that previously had weeks to react may increasingly have days or hours.

Why the US restricted Fable 5 and Mythos 5

The government testing emerged amid a separate dispute over access to Anthropic’s newest models.

On June 12, the US government directed Anthropic to prevent foreign nationals from accessing Fable 5 and Mythos 5, including foreign-national employees working for the company inside the United States. Anthropic responded by disabling the models for its customers, saying it could not immediately guarantee compliance with the nationality requirement while continuing normal access.

Fable 5 was developed as a more broadly available version of the same underlying model, with additional safeguards intended to detect and block potentially dangerous requests. Mythos 5 remained restricted to vetted partners because of its stronger cybersecurity and biological research capabilities.

Anthropic said the government’s directive did not include a detailed explanation of the national security concern. The company believed the decision related to a method of bypassing Fable 5’s safeguards, which had reportedly been demonstrated while identifying a small number of previously known vulnerabilities. Anthropic argued that the findings were minor and could also be produced by other publicly available models. That remains the company’s account, rather than an independently verified description of the government’s reasoning.

The close timing between the government exercise, Warner’s comments and the access restrictions has encouraged speculation that Mythos’ performance triggered the directive. No public evidence has established that connection.

The episode nevertheless exposes a difficult policy dilemma. Restricting a powerful model may reduce the risk that it is misused, but it can also remove an important tool from legitimate defenders. If competing models develop similar capabilities, withdrawing one system may do little to constrain attackers while slowing down the organisations trying to protect critical infrastructure.

The attacker’s advantage is time

Cybersecurity has always involved an imbalance.

Defenders must protect every important account, device, application, cloud environment and supplier relationship. Attackers need to find only one usable route inside.

AI could widen that imbalance by making reconnaissance and vulnerability research faster and cheaper. It can help analyse public information about employees, technologies and suppliers. It can support the creation of convincing phishing messages, examine stolen information and suggest ways to exploit insecure configurations.

The UK National Cyber Security Centre expects AI to increase the frequency and intensity of cyber intrusions through more effective reconnaissance, vulnerability research, exploit development, social engineering and data analysis. It has warned that critical systems could become more vulnerable if defensive practices fail to keep pace with frontier AI capabilities.

The Mythos test demonstrates why organisations cannot assume they are safe simply because their vulnerabilities are obscure.

Security through obscurity depends on weaknesses remaining difficult to find. AI progressively removes that protection. A forgotten server, an exposed cloud credential, an outdated dependency or an overly privileged account may now be discovered by automated systems capable of testing possible attack paths continuously.

The central challenge is therefore not whether organisations can prevent every vulnerability. They cannot. It is whether they can reduce exposure, detect malicious activity and recover before an attacker turns a weakness into a serious incident.

Cybersecurity fundamentals are becoming more important, not less

Advanced AI may be changing the threat landscape, but the most effective response begins with established security disciplines.

Organisations need an accurate inventory of their systems, applications, data, suppliers and internet-facing services. A company cannot protect an asset it does not know exists. Unsupported software, abandoned cloud resources and forgotten accounts should be treated as potential entry points.
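At its simplest, checking an inventory means comparing what is recorded with what is actually observed. The sketch below is one hedged illustration: it assumes both lists are available as sets of hostnames, and the function names and example.com hosts are hypothetical.

```python
def unmanaged_assets(inventory: set[str], discovered: set[str]) -> set[str]:
    """Hosts seen on the network that nobody has recorded in the inventory."""
    return discovered - inventory


def stale_records(inventory: set[str], discovered: set[str]) -> set[str]:
    """Inventory entries that no longer correspond to an observed host."""
    return inventory - discovered


# Hypothetical data: what the asset register says versus what a scan found.
inventory = {"web01.example.com", "db01.example.com", "mail.example.com"}
discovered = {"web01.example.com", "db01.example.com", "test-old.example.com"}

print("Unmanaged:", sorted(unmanaged_assets(inventory, discovered)))
print("Stale:", sorted(stale_records(inventory, discovered)))
```

The unmanaged set is the one to investigate first, since a forgotten host is exactly the kind of asset an automated search is likely to find.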

Security updates must be deployed according to risk rather than convenience. Internet-facing vulnerabilities and flaws known to be actively exploited deserve immediate attention. The faster AI becomes at analysing vulnerabilities, the less time defenders can afford to spend waiting for the next scheduled maintenance window.
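Deploying updates according to risk can be expressed as a simple sort order. The sketch below is an assumption about how a team might rank findings, not a standard: actively exploited flaws come first, then internet-facing ones, then the highest severity score. The Finding class, its field names, the identifiers and the 0 to 10 severity scale are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    identifier: str
    severity: float           # assumed 0.0-10.0 scale, higher is worse
    internet_facing: bool
    actively_exploited: bool


def patch_order(findings: list[Finding]) -> list[Finding]:
    """Sort so exploited, then exposed, then most severe findings come first."""
    return sorted(
        findings,
        # False sorts before True, so negate the flags we want at the front.
        key=lambda f: (not f.actively_exploited, not f.internet_facing, -f.severity),
    )


# Hypothetical findings for illustration only.
findings = [
    Finding("FINDING-A", 9.8, internet_facing=False, actively_exploited=False),
    Finding("FINDING-B", 6.5, internet_facing=True, actively_exploited=True),
    Finding("FINDING-C", 7.2, internet_facing=True, actively_exploited=False),
    Finding("FINDING-D", 8.1, internet_facing=False, actively_exploited=True),
]

for finding in patch_order(findings):
    print(finding.identifier)
```

Note that the finding with the highest severity score ends up last in this example, because it is neither exposed to the internet nor known to be exploited. Whether that trade-off is right depends on the organisation; the point is that the ordering is decided by risk and not by the maintenance calendar.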

Identity controls are equally important. Phishing-resistant multifactor authentication, carefully managed administrator privileges and rapid removal of inactive accounts can prevent a stolen password from becoming a network-wide compromise.
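Finding inactive accounts is largely a matter of comparing last sign-in times against a cut-off. The sketch below assumes a mapping of account names to last sign-in timestamps is already available; the 90-day threshold, the function name and the sample accounts are illustrative assumptions and not a recommendation from any framework.

```python
from datetime import datetime, timedelta, timezone


def inactive_accounts(
    last_logins: dict[str, datetime],
    now: datetime,
    max_idle_days: int = 90,  # assumed threshold; set according to policy
) -> list[str]:
    """Return account names whose last sign-in is older than the cut-off."""
    cutoff = now - timedelta(days=max_idle_days)
    return sorted(name for name, seen in last_logins.items() if seen < cutoff)


# Hypothetical sign-in records.
now = datetime(2026, 6, 24, tzinfo=timezone.utc)
last_logins = {
    "alice": datetime(2026, 6, 20, tzinfo=timezone.utc),
    "bob": datetime(2026, 2, 1, tzinfo=timezone.utc),
    "svc-legacy": datetime(2025, 11, 2, tzinfo=timezone.utc),
}

print(inactive_accounts(last_logins, now))
```

In practice the output would feed a review step before anything is disabled, since service accounts can be idle for legitimate reasons.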

Networks should be segmented so that gaining access to one system does not provide unrestricted movement across the organisation. Logging and monitoring must be detailed enough to reveal unusual behaviour, while incident response plans should be tested through exercises rather than left unread until a crisis begins.

Backups also need to be isolated, monitored and regularly restored in tests. A backup that has never been tested is only an assumption.
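One basic way to test a restore is to confirm that a restored file is byte-for-byte identical to the original. The sketch below does this by comparing SHA-256 digests using Python's standard library; the function names are hypothetical, and a real restore test would also cover databases, permissions and application start-up.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute a file's SHA-256 digest, reading in chunks to limit memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restore_matches(original: Path, restored: Path) -> bool:
    """True if the restored copy has exactly the same contents as the original."""
    return sha256_of(original) == sha256_of(restored)
```

A scheduled job could call restore_matches on a sample of files after each test restore and raise an alert on any mismatch.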

These practices align with the broader lifecycle set out in the NIST Cybersecurity Framework, which organises cyber risk around governing, identifying, protecting, detecting, responding and recovering. The framework makes clear that cybersecurity is not a single product or annual compliance exercise. It is a continuous business process.

The NCSC has similarly advised organisations to reduce unnecessary exposure, deploy security updates quickly, monitor for malicious activity and prepare to respond. Its message is direct: AI will increasingly expose organisations that have failed to establish a strong security baseline.

Technology cannot compensate for unprepared people

The Mythos story may appear to be about classified networks, frontier models and national security policy, but its lesson applies to organisations of every size.

AI-powered security tools will become more capable. So will AI-assisted attackers. Purchasing another security platform will not be enough if employees cannot recognise suspicious activity, developers do not understand secure design, administrators overlook dangerous configurations or leaders do not know how to respond when an incident occurs.

Cybersecurity depends on decisions made throughout an organisation.

An employee decides whether to open an unexpected attachment. A developer decides how an application validates input. An administrator determines who receives privileged access. A manager decides whether a critical patch can be delayed. An executive determines whether cybersecurity is treated as a technical expense or a business priority.

Training turns those individual decisions into a coordinated defence.

Build your defence before AI finds the weakness

The lesson from Mythos is not that resistance is hopeless. It is that waiting has become far more dangerous.

The next serious test of your systems may not begin with a large team of elite hackers. It may begin with an automated tool capable of inspecting weaknesses, testing attack paths and adapting its approach in a fraction of the time previously required.

The organisations most likely to withstand that pressure will be those that invest in knowledge before an incident begins.

The Hack Academy’s online training programme provides a practical route for individuals and teams to strengthen that knowledge. Its self-paced courses cover cybersecurity fundamentals, Linux, cryptography, networking, internet vulnerabilities, cloud security and penetration testing. The programme combines theory with practical exercises, with optional challenge labs designed to help learners apply their skills in controlled environments.

Do not wait for an AI-powered attacker to conduct the first serious audit of your defences.

Take control of your cybersecurity readiness. Develop the skills to recognise threats, understand vulnerabilities and protect the systems that matter. Start The Hack Academy’s online training programme and build a stronger defence today.
