How Anthropic Learned Mythos Was Too Dangerous for Release
Source: bloomberg.com
TL;DR
- Anthropic's AI researcher Nicholas Carlini tested the new Mythos model remotely from Bali and found it capable of hacking core computing systems.[[1]](https://www.bloomberg.com/news/features/2026-04-16/how-anthropic-discovered-mythos-ai-was-too-dangerous-for-release)
- Mythos autonomously discovered thousands of zero-day vulnerabilities in every major OS and browser, including a 27-year-old OpenBSD flaw and a 16-year-old FFmpeg bug.[[2]](https://www.anthropic.com/glasswing)
- The company withheld public release to avoid misuse, instead limiting access via Project Glasswing to help partners patch flaws before attackers exploit them.[[2]](https://www.anthropic.com/glasswing)
The story at a glance
Anthropic decided not to release its advanced AI model Mythos publicly after internal tests revealed its exceptional ability to find and exploit software vulnerabilities in critical systems. AI researcher Nicholas Carlini, testing remotely while attending a wedding in Bali, quickly uncovered techniques for infiltrating widely used computing infrastructure, prompting warnings from the company's experts. Banks, governments (including the US Treasury under Secretary Scott Bessent), and Project Glasswing partners such as Amazon, Apple, Google, and Microsoft are now racing to assess and mitigate the risks of this new era of AI-driven threats. This comes amid Anthropic's April 7 announcement of Mythos Preview and Project Glasswing.[[2]](https://www.anthropic.com/glasswing)[[1]](https://www.bloomberg.com/news/features/2026-04-16/how-anthropic-discovered-mythos-ai-was-too-dangerous-for-release)
Key points
- In February, Anthropic made Mythos available for internal review; Carlini, paid by the company to stress-test models, used it remotely and found it could enable espionage, theft, or sabotage by hacking low-level systems.[[1]](https://www.bloomberg.com/news/features/2026-04-16/how-anthropic-discovered-mythos-ai-was-too-dangerous-for-release)
- Mythos identified thousands of high-severity zero-days, including a remotely triggerable crash in OpenBSD that went unpatched for 27 years, an FFmpeg bug that survived 16 years and 5 million tests, and chained Linux kernel exploits yielding root access.[[2]](https://www.anthropic.com/glasswing)
- The model excels on benchmarks, scoring 83.1% on CyberGym versus a prior best of 66.6%, a step change in autonomous vulnerability discovery and multi-step attacks with minimal human input.[[2]](https://www.anthropic.com/glasswing)
- Anthropic launched Project Glasswing on April 7, granting limited access to 12 launch partners, including AWS, Apple, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, Nvidia, Palo Alto Networks, and Broadcom, plus 40 others, for defensive patching.[[2]](https://www.anthropic.com/glasswing)
- Anthropic is committing $100M in credits and $4M in donations to open-source security, and requires partners to report findings publicly within 90 days.[[2]](https://www.anthropic.com/glasswing)
- US Treasury Secretary Scott Bessent and other officials met with bank CEOs, including those of Bank of America and Goldman Sachs, to urge stronger defenses; the UK AI Security Institute, which has access to the model, calls it a step change in cyber threats.[[3]](https://www.bloomberg.com/opinion/articles/2026-04-15/anthropic-mythos-ai-is-a-wake-up-call-for-everyone-not-just-banks)
Details and context
Mythos, part of Anthropic's Claude series, marks a shift: AI can now outperform most humans at coding tasks like vulnerability hunting and can chain exploits autonomously, far beyond prior models. Claude Opus 4.6 succeeded only twice on a difficult Firefox test; Mythos succeeded 181 times. Even non-experts can use it overnight to mount attacks, raising fears of rapid proliferation to bad actors.[[2]](https://www.anthropic.com/glasswing)
Anthropic's system card details mitigations such as probe classifiers that monitor for misuse (e.g., worms, exploit development), but notes that access remains restricted because the risks of general release currently outweigh the benefits. The model saturated many benchmarks, prompting updates to Anthropic's Responsible Scaling Policy.[[4]](https://www.anthropic.com/claude-mythos-preview-system-card)
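The system card does not publish the probe architecture; as a rough illustration only, a "probe classifier" in the safety literature is typically a small linear model trained on a frozen model's internal activations to flag a behavior of interest. The sketch below trains such a probe on synthetic activation vectors; everything in it, including `fake_activations`, is invented for illustration and is not Anthropic's implementation.

```python
# Toy sketch of an activation "probe classifier": a linear model trained on
# (synthetic) hidden-state vectors to flag misuse-related inputs. Real probes
# would read activations from the production model; everything here is made up.
import math
import random

random.seed(0)
DIM = 16  # pretend activation width

def sigmoid(z: float) -> float:
    z = max(-30.0, min(30.0, z))  # clamp to avoid math.exp overflow
    return 1.0 / (1.0 + math.exp(-z))

def fake_activations(is_misuse: bool) -> list[float]:
    """Synthetic stand-in for a hidden state: misuse examples are shifted
    along the first four dimensions so the two classes are separable."""
    shift = 1.5 if is_misuse else -1.5
    return [random.gauss(shift if i < 4 else 0.0, 1.0) for i in range(DIM)]

# Labeled dataset: 1 = misuse-related, 0 = benign.
data = [(fake_activations(True), 1) for _ in range(200)] + \
       [(fake_activations(False), 0) for _ in range(200)]
random.shuffle(data)

# Train a logistic-regression probe with plain stochastic gradient descent.
w, b, lr = [0.0] * DIM, 0.0, 0.1
for _ in range(50):
    for x, y in data:
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
        g = p - y  # gradient of the log-loss w.r.t. the logit
        w = [wi - lr * g * xi for wi, xi in zip(w, x)]
        b -= lr * g

def flag(x: list[float]) -> bool:
    """Probe verdict: True if the activations look misuse-related."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) > 0.5

accuracy = sum(flag(x) == bool(y) for x, y in data) / len(data)
print(f"probe accuracy on synthetic data: {accuracy:.2f}")
```

The appeal of probes as a mitigation is that they are cheap to run on every request: the expensive model forward pass already produces the activations, and the probe itself is a single dot product.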
Mythos's existence was first revealed by a data leak in March; since then, responses have included meetings with regulators worldwide. Some observers question the hype, but most affirm the capability jump, per the UK AISI's evaluations.[[5]](https://fortune.com/2026/03/26/anthropic-says-testing-mythos-powerful-new-ai-model-after-data-leak-reveals-its-existence-step-change-in-capabilities)
Key quotes
"I've found more bugs in the last few weeks with Claude Mythos than in the rest of my entire life combined." — Nicholas Carlini, AI researcher affiliated with Anthropic and Google DeepMind.[[6]](https://mashable.com/article/claude-mythos-preview-project-glasswing-pr-stunt-cybersecurity-experts)
"Dangerous stuff." — Dario Amodei, Anthropic CEO.[[3]](https://www.bloomberg.com/opinion/articles/2026-04-15/anthropic-mythos-ai-is-a-wake-up-call-for-everyone-not-just-banks)
Why it matters
AI models like Mythos signal a cybersecurity tipping point at which automated attacks could outpace human defenses, threatening economies, infrastructure, and national security if misused. Companies and users urgently need to harden their software stacks, investors are eyeing AI safety firms, and banks and tech giants must prioritize patching as zero-days surface faster. Watch for government access demands, Glasswing's patching results over the next 90 days, and the safeguards in Anthropic's next Opus release, though similar capabilities may soon proliferate regardless.[[2]](https://www.anthropic.com/glasswing)