Anthropic’s New AI Finds Thousands of Zero-Day Exploits

Anthropic has unveiled Mythos preview, a powerful AI model that can autonomously find thousands of zero-day software vulnerabilities. The model, which will not be publicly released due to its potential to destabilize industries, is now part of Project Glasswing, a collaboration with major tech companies to enhance cybersecurity.


Anthropic Unveils Powerful, Unreleased AI Model Capable of Finding Critical Software Flaws

Artificial intelligence is advancing at a breakneck pace, and a new development from Anthropic highlights just how quickly these tools are becoming incredibly powerful. Anthropic has revealed a frontier AI model, codenamed Mythos preview, that has demonstrated an astonishing ability to find thousands of previously unknown software vulnerabilities. These are often called “zero-day” exploits because software makers have zero days to fix them before they can be used by attackers.

This powerful AI model, however, will not be released to the public. Anthropic believes that making it widely available could destabilize entire industries. While Mythos preview remains private, Anthropic is partnering with major tech companies like Amazon, Apple, and Google through a new initiative called Project Glasswing. This project aims to secure critical software in the age of advanced AI.

Mythos Preview’s Impressive Performance

In tests, Mythos preview has shown remarkable capabilities, especially on software engineering tasks. It scored 93.9% on the SWE-bench benchmark, significantly outperforming other leading AI models such as Claude Opus, Gemini Pro, and even OpenAI’s GPT-4. The model is not just good at understanding and generating text; it is exceptionally skilled at writing code and reasoning about software structure.

Beyond benchmarks, Mythos preview successfully completed an end-to-end simulation of a corporate network attack, a scenario designed to take a human expert more than 10 hours to solve. No other AI model had previously completed this task. The result suggests the AI can conduct autonomous cyberattacks against weakly secured networks.

The Danger of Zero-Day Vulnerabilities

Zero-day vulnerabilities are flaws in software that developers don’t know about. When an attacker discovers one, they can exploit it immediately because there’s no patch or fix available. This gives the target no defense against the attack.

These vulnerabilities are highly valuable, with governments and criminal groups reportedly paying millions of dollars for them. They can go undetected for months or even years, allowing attackers to operate silently. Famous examples include Stuxnet, a sophisticated cyberweapon that targeted Iran’s nuclear program, and EternalBlue, an exploit of a flaw in Microsoft’s SMB protocol that was used in widespread ransomware attacks.
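Many flaws in this class trace back to a single missing check, typically a bounds or length validation on attacker-controlled input. The following sketch is purely illustrative and is not taken from any of the vulnerabilities mentioned above:

```python
def parse_record(buf: bytes) -> bytes:
    """Toy parser: the first byte declares the payload length."""
    declared = buf[0]  # attacker-controlled length field
    # BUG: no check that the buffer actually holds `declared` bytes.
    # In a memory-unsafe language this read could run past the end of
    # the buffer -- the out-of-bounds class behind many real zero-days.
    return buf[1:1 + declared]


def parse_record_fixed(buf: bytes) -> bytes:
    """The same parser with the missing bounds check restored."""
    declared = buf[0]
    if declared > len(buf) - 1:
        raise ValueError("declared length exceeds buffer size")
    return buf[1:1 + declared]
```

The buggy version silently accepts a record that claims five payload bytes but carries only two; the fixed version rejects it. Tools that hunt zero-days, human or AI, are essentially searching for exactly this kind of missing check at scale.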

Mythos’s Ability to Find Exploits Autonomously

What makes Mythos preview particularly concerning is its ability to find these zero-day vulnerabilities on its own, without any human help. Anthropic reports that the model has identified thousands of critical flaws across major operating systems and web browsers. This capability was previously limited to elite cybersecurity experts.

Interestingly, this exploit-finding skill isn’t a special feature that was added to Mythos. It appears to be a natural outcome of its general-purpose design, similar to other large language models. This means the potential for finding vulnerabilities comes standard with this advanced AI.
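Anthropic has not described how Mythos preview actually searches for flaws. The classical automated baseline it would be compared against is fuzzing: hammering a parser with random inputs and recording whatever makes it crash. A minimal sketch, where both the toy parser and its planted bug are hypothetical stand-ins:

```python
import random


def toy_parser(data: bytes) -> int:
    """A deliberately buggy parser standing in for real target code."""
    if len(data) >= 2:
        return data[0] // data[1]  # ZeroDivisionError when data[1] == 0
    return 0


def fuzz(parser, trials: int = 10_000, seed: int = 0) -> list:
    """Feed random byte strings to `parser`, collecting inputs that crash it."""
    rng = random.Random(seed)
    crashes = []
    for _ in range(trials):
        data = bytes(rng.randrange(256) for _ in range(rng.randrange(1, 8)))
        try:
            parser(data)
        except Exception:
            crashes.append(data)
    return crashes


crashing_inputs = fuzz(toy_parser)
```

Every input the loop flags has a zero second byte, meaning it has rediscovered the planted bug without being told where to look. The reported difference with Mythos preview is that a language model can reason about the code directly rather than probing it blindly.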

Real-World Discoveries and Concerns

Anthropic shared some specific examples of Mythos preview’s discoveries. The model found a 27-year-old vulnerability in OpenBSD, an operating system known for its strong security. This flaw could allow remote attackers to crash any machine running the OS. It also found a 16-year-old vulnerability in ffmpeg, a widely used software for handling video and audio.

These discoveries were made using only about $50 worth of computing power. This low cost, combined with the potential for millions in bounty rewards for finding such exploits, suggests a high return on investment for using AI in cybersecurity.
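The return-on-investment claim is simple arithmetic. Taking the article’s roughly $50 compute figure against an illustrative $1 million bounty (a conservative reading of “millions”):

```python
compute_cost = 50    # USD of compute per discovery, per the article
bounty = 1_000_000   # USD; illustrative lower end of "millions"

roi = bounty / compute_cost
print(f"{roi:,.0f}x return on compute spent")  # 20,000x
```

Even if real bounties and compute costs vary by an order of magnitude in either direction, the asymmetry remains enormous.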

Project Glasswing: Collaboration for Security

Recognizing the profound implications of Mythos preview, Anthropic launched Project Glasswing. This initiative brings together leading technology firms, including Amazon, Apple, Broadcom, Cisco, Google, JP Morgan Chase, Microsoft, and Nvidia. The goal is to collaborate on securing critical software and infrastructure.

Companies involved have confirmed the AI’s capabilities. Cisco stated that AI has crossed a threshold requiring urgent action to protect infrastructure. AWS is using Mythos preview to strengthen its own code, and Microsoft and CrowdStrike are also participating. Anthropic is offering up to $100 million in usage credits to help these organizations address the vulnerabilities the AI might find.

AI’s Rapid Advancement and the Need for Alignment

The speed at which AI is progressing is evident. Google DeepMind recently released Gemma 4, an open-source model that runs on a phone yet performs comparably to GPT-4. This shows how quickly cutting-edge AI capabilities are becoming widely accessible.

Anthropic is also focused on AI safety and alignment, ensuring models behave as intended. While Mythos preview is considered highly aligned according to Anthropic’s tests, its advanced capabilities also present significant misalignment risks. The company is developing ways to monitor the AI’s internal processes to detect deceptive or harmful behavior before it occurs.

Why This Matters

The development of AI models like Mythos preview marks a significant turning point in cybersecurity. The ability of AI to autonomously find critical software flaws at an unprecedented scale and speed presents both immense opportunities and serious risks. On one hand, AI can help defenders discover and fix vulnerabilities before malicious actors do, making our digital world safer.

On the other hand, if such powerful AI tools fall into the wrong hands, they could be used to launch devastating cyberattacks. The fact that Anthropic is not releasing this model publicly underscores the potential danger. The collaboration through Project Glasswing is a crucial step in preparing for this new era of AI-powered cybersecurity, ensuring that the benefits of advanced AI are harnessed responsibly while mitigating the inherent risks.


Source: the new Claude is "TOO DANGEROUS" (YouTube)

Written by

Joshua D. Ovidiu

I enjoy writing.
