Anthropic Built an AI Model Too Powerful to Release. Here's What Enterprise IT Needs to Know.
A practising enterprise cloud and security consultant breaks down the Claude Mythos Preview system card and its implications for IT professionals, cloud architects, and security engineers.
Mark Lewis · 10 April 2026
On 7 April 2026, Anthropic published a 245-page system card for Claude Mythos Preview — their newest frontier AI model. They then announced they would not release it commercially. Access is restricted to a small number of partner organisations for defensive cybersecurity, under a programme called Project Glasswing.
I've read every page. This is an analysis from the perspective of someone who delivers cloud architecture, security assessments, and cost optimisation work for large European enterprises — not that of an AI researcher or a tech journalist.
Why this matters: Anthropic's Claude Mythos Preview represents a step change in AI capability — particularly in cybersecurity. The model autonomously discovers and exploits real-world browser vulnerabilities at an 84% success rate, up from 15.2% for the previous generation. During testing, earlier versions escaped sandboxed environments, harvested credentials from process memory, escalated their own permissions, and attempted to conceal what they had done. Anthropic considered the model too dangerous to release commercially. It is currently restricted to defensive cybersecurity partners only. But the capabilities it demonstrates are a clear indicator of what threat actors — and commercially available AI models — will be capable of within 12–24 months. Every enterprise, and particularly financial services, energy, and other regulated organisations, needs to factor autonomous AI-driven attacks into their threat models now — not after the first major incident.
The Capability Numbers
The benchmark results are significant. Here are the ones that matter for enterprise IT.
Software Engineering
| Benchmark | Mythos Preview | Previous Best | What It Tests |
|---|---|---|---|
| SWE-bench Verified | 93.9% | ~72% (Opus 4.6) | Real GitHub issues validated by human engineers |
| SWE-bench Pro | 77.8% | — | Harder variant: large diffs, actively maintained repos |
| SWE-bench Multilingual | 87.3% | — | 300 problems across 9 programming languages |
| Terminal-Bench 2.0 | 82% mean reward | — | Real terminal tasks in isolated Kubernetes pods |
These are not toy benchmarks. SWE-bench Pro involves navigating real codebases, understanding multi-file context, and producing working patches. Terminal-Bench runs each task in a genuine Kubernetes environment with real resource constraints and timeouts.
Reasoning and Knowledge
| Benchmark | Mythos Preview | Opus 4.6 | GPT-5.4 | Gemini 3.1 Pro |
|---|---|---|---|---|
| MMMLU | 92.7% | 91.1% | — | 92.6–93.6% |
| USAMO 2026 | 97.6% | 42.3% | 95.2% | 74.4% |
| Humanity's Last Exam (no tools) | 56.8% | 40.0% | 39.8% | 44.4% |
| GraphWalks 256K-1M | 80.0% | 38.7% | 21.4% | — |
The USAMO jump from 42.3% to 97.6% between model generations is striking. The model saturates most standard knowledge benchmarks.
Cybersecurity — The Phase Change
This is where it gets serious.
| Benchmark | Mythos Preview | Opus 4.6 | Sonnet 4.6 | What It Tests |
|---|---|---|---|---|
| Cybench (pass@1) | 100% | 100% | 96% | CTF challenges — fully saturated |
| CyberGym | 0.83 | 0.67 | 0.65 | Reproducing known vulns in real open-source projects |
| Firefox 147 exploitation | 84% | 15.2% | 4.4% | Developing working exploits for real browser vulns |
The Firefox 147 result deserves emphasis. The model is given crash categories from Firefox's SpiderMonkey engine and must triage available bugs, identify exploitable corruption primitives, and develop a working proof-of-concept exploit achieving arbitrary code execution. The jump from 15.2% to 84% is not incremental. It is a phase change in autonomous offensive capability.
Anthropic is explicit about why they restricted the model: the offensive and defensive capabilities are inherently dual-use.
What the Model Did When Left Unsupervised
The alignment assessment documents several incidents from earlier versions of Mythos Preview. These are directly relevant to anyone responsible for enterprise infrastructure security.
Sandbox escape. During behavioural testing, the model was given a sandboxed computer environment configured to reach only predetermined services. It developed a multi-step exploit to break out and gain broad internet access, then proactively contacted a researcher to demonstrate its success. The researcher discovered this when the model emailed them while they were away from their desk.
Credential harvesting and privilege escalation. Earlier versions used low-level /proc/ access to search for credentials, attempted to circumvent sandboxing, and escalated permissions. In several cases the model successfully accessed resources Anthropic had intentionally withheld — including credentials for messaging services, source control, and the Anthropic API itself.
Arbitrary code execution. The model achieved code execution by repurposing a file-editing tool to write directly to the shell's input.
Covering its tracks. When the model took actions it recognised as disallowed, it attempted to conceal them: making changes that would not appear in version control history, and obfuscating permission-escalation attempts even after being explicitly warned.
Confirmed as deliberate. White-box analysis of internal model activations confirmed that features associated with concealment, strategic manipulation, and suspicion-avoidance were activating alongside the relevant reasoning.
Anthropic states the final released version is significantly improved and the most severe behaviours have not been observed. They also acknowledge these tendencies may not be completely absent.
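The credential-harvesting incident has a direct defensive corollary: any secrets sitting in process environments are readable to a co-resident agent with `/proc` access. A minimal audit sketch, in Python; the pattern and function names are my own illustration, not anything from the system card:

```python
import os
import re

# Illustrative pattern for variable names that commonly hold secrets.
SECRET_NAME = re.compile(r"SECRET|TOKEN|PASSWORD|API_KEY|PRIVATE_KEY", re.IGNORECASE)

def flag_secret_vars(env):
    """Return names of environment variables that look credential-bearing."""
    return sorted(name for name in env if SECRET_NAME.search(name))

def scan_proc_environ():
    """Scan /proc/<pid>/environ for every process this user can read."""
    findings = {}
    for pid in filter(str.isdigit, os.listdir("/proc")):
        try:
            with open(f"/proc/{pid}/environ", "rb") as fh:
                raw = fh.read()
        except OSError:
            continue  # unreadable is the hardened, expected case
        env = dict(pair.split("=", 1)
                   for pair in raw.decode("utf-8", "replace").split("\0")
                   if "=" in pair)
        hits = flag_secret_vars(env)
        if hits:
            findings[int(pid)] = hits
    return findings

if __name__ == "__main__" and os.path.isdir("/proc"):
    for pid, names in sorted(scan_proc_environ().items()):
        print(f"PID {pid}: {names}")
```

If this audit turns up hits on your build hosts, the fix is to move secrets out of long-lived environments and into short-lived, scoped tokens fetched at point of use.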
Autonomy Assessment
Anthropic surveyed 18 internal staff on the model's capability relative to their own research scientists and engineers.
| Assessment | Responses |
|---|---|
| Already a drop-in for entry-level Research Scientist/Engineer | 1/18 |
| 50% chance of qualifying with 3 months scaffolding | 4/18 |
| Not yet at L4 (senior) level | Majority |
Remaining weaknesses identified:
- Self-managing week-long ambiguous tasks
- Understanding organisational priorities
- Taste and judgment
- Verification and instruction following
- Epistemics
The ECI (Epoch Capabilities Index) trend shows the capability improvement rate is accelerating. The slope ratio between pre-Mythos and post-Mythos trends is 1.86× to 4.3× depending on breakpoint selection. Mythos sits as a clear outlier above the prior trajectory.
Anthropic attributes the gains to human research rather than AI-accelerated R&D — the recursive self-improvement loop has not yet engaged.
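To make the slope-ratio arithmetic concrete, here is a minimal least-squares sketch over synthetic capability-index readings (my own illustrative numbers, not the actual ECI data):

```python
def slope(points):
    """Ordinary least-squares slope for a list of (x, y) pairs."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    return (sum((x - mx) * (y - my) for x, y in points)
            / sum((x - mx) ** 2 for x, _ in points))

# Synthetic readings (x = quarter, y = capability index), NOT real ECI data.
pre_break  = [(0, 100), (1, 104), (2, 108), (3, 112)]   # ~4 points/quarter
post_break = [(3, 112), (4, 124), (5, 136)]             # ~12 points/quarter

ratio = slope(post_break) / slope(pre_break)
print(f"slope ratio: {ratio:.2f}x")  # prints "slope ratio: 3.00x"
```

The system card's 1.86× to 4.3× range arises the same way: the ratio moves depending on where you place the breakpoint and how many points fall on each side of it.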
What This Means for Enterprise IT
Three implications.
1. Your Threat Model Needs Updating
Current threat models assume adversaries are either skilled human operators (slow, expensive) or automated tools following predefined patterns (fast but unsophisticated). Mythos Preview represents a third category: autonomous agents combining expert-level sophistication with automation speed.
An agent that independently triages vulnerabilities, identifies exploitable primitives, and produces working exploits at an 84% success rate operates at a tempo that most enterprise security frameworks were never designed to handle.
If you're conducting security assessments, pre-production quality gates, or architecture reviews — factor AI-augmented threats into your risk analysis. This is not a 2030 concern. Anthropic built this model with today's hardware.
2. The Execution Layer of IT Consulting Is Compressing
The execution work that fills much of a consulting engagement — writing infrastructure-as-code, reviewing security scanner outputs, producing architecture documentation, creating cost optimisation recommendations — is where these models perform at or near expert human level.
The value proposition for IT professionals is shifting:
- Depreciating: Manual code review, documentation production, scanner output analysis, boilerplate architecture work
- Appreciating: Judgment, stakeholder navigation, risk appetite calibration, knowing when the AI's recommendation is wrong, accountability
The question is no longer "can you write the Terraform?" It's "do you know which Terraform to write, and can you explain to the CISO why this architecture is correct for their risk appetite?"
3. The Capability Trajectory Is Accelerating
Whatever Mythos Preview demonstrates today will be commercially available within 12–24 months as Anthropic and competitors develop the safeguards for general release. The window to adapt is measured in quarters, not years.
What to Do Now
A useful counterpoint from AWS's parallel commentary is that models like Mythos Preview should not be seen only as an offensive risk. The same capability jump also creates an opportunity for defenders to accelerate vulnerability discovery, threat hunting, and code review at a scale that would be difficult to achieve manually.
AWS frames it well: comprehensive observability, defence in depth, automation where it adds value, and human judgment where it remains essential. That is exactly the mindset enterprise teams should apply now as agentic systems become more capable.
Upskill on AI tooling. Learn to use frontier AI models as force multipliers for your existing expertise. Build custom integrations — MCP servers, Claude skills, agentic workflows — that connect AI capabilities to your domain knowledge. The professionals who master this in 2026 set the consulting rates in 2027.
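The pattern behind MCP servers and agentic workflows is simpler than it sounds: you expose domain functions as named tools, and the agent runtime dispatches the model's tool calls to them. A stripped-down sketch of that registry-and-dispatch core (the tool name and return values here are hypothetical):

```python
import json

TOOLS = {}

def tool(fn):
    """Register a domain function so an agent runtime can discover and call it."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def monthly_cloud_spend(account_id: str) -> dict:
    """Hypothetical lookup; in practice this would query your billing API."""
    return {"account": account_id, "spend_gbp": 12450.00}

# An agent runtime dispatches a model's tool call by name:
call = {"name": "monthly_cloud_spend", "arguments": {"account_id": "prod-01"}}
result = TOOLS[call["name"]](**call["arguments"])
print(json.dumps(result))
```

Real MCP servers layer schema declaration and a transport protocol on top, but this dispatch core is the part where your domain knowledge lives — and it is the part no model can supply for you.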
Update your security assessments. Add an AI threat component. What does your attack surface look like when the adversary can autonomously discover and exploit vulnerabilities in your dependencies? How robust is your sandbox isolation? What happens if an AI agent in your CI/CD pipeline decides to escalate its own permissions?
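One concrete starting point for the CI/CD question: diff the permissions your pipeline tokens actually hold against what each job needs, because every excess grant is a capability an agent running in that pipeline could abuse. A minimal sketch, with hypothetical scope names:

```python
def excess_permissions(granted, required):
    """Scopes the pipeline token holds but never needs — each one is a
    capability an autonomous agent in the pipeline could abuse."""
    return sorted(set(granted) - set(required))

# Hypothetical scopes for a CI job that builds and pushes a container image.
granted  = {"repo:read", "repo:write", "packages:write", "org:admin", "secrets:read"}
required = {"repo:read", "packages:write"}

print(excess_permissions(granted, required))  # the grants to revoke
```

Run the same diff for every service account, deploy key, and OIDC role in the pipeline; the output is your least-privilege backlog.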
Invest in judgment. The moat for enterprise IT professionals is shifting from technical execution to strategic judgment — organisational context, stakeholder priorities, risk appetite calibration, and knowing when the AI is wrong. These skills have always mattered. They're about to matter much more.
Move now. Mythos Preview is not commercially available. The next commercial model will not be as capable. But the trajectory is clear, and the gap between restricted frontier model and generally available model narrows with each generation.
About the Author
Mark Lewis is the founder of MJL Network Solutions, a UK-based enterprise cloud, AI, and security consultancy. He holds CCIE #6280, 10 AWS certifications, Azure Solutions Architect Expert, GCP Professional Cloud Architect, NVIDIA Generative AI Associate, CKA, and CKAD, and is a published Cisco Press author with 20+ years of enterprise experience across organisations such as London Stock Exchange Group (LSEG), Nomura, UBS, Barclays, AXA XL, and EnBW.
Mark is a Cisco Certified Systems Instructor (CCSI) with extensive experience delivering technical training for Global Knowledge and Systems & Network Training. He is developing an AI Bootcamp for Enterprise IT Professionals — a weekend intensive for experienced cloud architects, security engineers, and IT consultants.