Claude Opus 4.6 Found 22 Firefox Bugs in Two Weeks

Twenty minutes. That's how long Claude Opus 4.6 needed to spot its first Firefox vulnerability. By the end of two weeks, Anthropic's AI had uncovered 22 security flaws in Mozilla's browser, 14 of them high-severity bugs that could crash systems or leak memory. Those 14 represent nearly one-fifth of all high-severity Firefox vulnerabilities patched in 2025, found by an AI that read source code like a seasoned security researcher.

🔍 How AI Became a Security Researcher

Cybersecurity just crossed a threshold. We're not talking about automated scanners running predetermined checks. Claude Opus 4.6 analyzed Firefox's source code the way a human expert would — reading through functions, understanding logic flows, spotting patterns that shouldn't exist.

Anthropic's Frontier Red Team chose Firefox deliberately. This browser powers millions of users worldwide, backed by Mozilla's security-conscious development team. If an AI could find unknown vulnerabilities in such battle-tested software, the implications stretch far beyond one browser.

The AI didn't rely on traditional fuzzing tools. Instead, it performed static code analysis across nearly 6,000 C++ files, understanding context and relationships between different code sections. Think of it as having a tireless security expert who never gets bored reading through endless lines of code.

Key numbers: 20 minutes to the first vulnerability, nearly 6,000 C++ files scanned, 22 new vulnerabilities found.

🎯 The Findings: CVE-2026-2796 and Critical Firefox Bugs

The most striking discovery came within those first 20 minutes. Claude Opus 4.6 identified a Use After Free vulnerability in Firefox's JavaScript engine — later assigned CVE-2026-2796 with a CVSS score of 9.8. That's critical territory, the kind of flaw that lets attackers execute arbitrary code on victim machines.

Anatomy of the Use After Free Bug

The vulnerability lurked in SpiderMonkey, Firefox's JavaScript engine. Specifically, in how the engine handled functions wrapped through Function.prototype.call.bind() within WebAssembly modules. A call_ref instruction without proper bounds checking created a dangling memory pointer.

What does this mean practically? An attacker could read sensitive information from memory or potentially replace function pointers, leading to remote code execution. Claude didn't just spot the bug — it proposed a fix that Anthropic researchers verified before submitting to Mozilla.
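To make the dangling-pointer idea concrete, here is a minimal toy model of a use-after-free. It is not SpiderMonkey code and does not reproduce the actual bug: the `ToyHeap` class and its methods are hypothetical names invented for illustration, and the "addresses" are just list indices. It only shows why a stale reference to freed memory can end up reading data that someone else put there.

```python
# Toy model of a use-after-free. NOT SpiderMonkey code: ToyHeap and its
# methods are hypothetical names used purely for illustration.

class ToyHeap:
    """A tiny allocator that recycles freed slots, as a real heap does."""

    def __init__(self):
        self.slots = []       # backing storage, indexed by "address"
        self.free_list = []   # addresses available for reuse

    def alloc(self, value):
        # Prefer a recycled slot; this reuse is what makes stale
        # pointers dangerous.
        if self.free_list:
            addr = self.free_list.pop()
            self.slots[addr] = value
        else:
            addr = len(self.slots)
            self.slots.append(value)
        return addr

    def free(self, addr):
        # The slot is released, but any copy of `addr` held elsewhere
        # still refers to it: that copy is now a dangling pointer.
        self.free_list.append(addr)

    def read(self, addr):
        return self.slots[addr]


heap = ToyHeap()
callback = heap.alloc("engine_function")     # object the engine will use later
heap.free(callback)                          # freed too early (the bug)
secret = heap.alloc("attacker_controlled")   # allocator reuses the same slot
leaked = heap.read(callback)                 # stale pointer sees the new data
```

In this sketch the stale `callback` handle and the fresh `secret` handle refer to the same slot, so reading through the old handle returns whatever the newer allocation stored. In a real engine the same confusion can expose memory contents or let an attacker substitute a function pointer.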

"By the end of this effort, we had scanned nearly 6,000 C++ files and submitted a total of 112 unique reports." (Anthropic Frontier Red Team)

⚡ The Methodology: Task Verifiers and Real-Time Testing

The secret sauce was what Anthropic calls "task verifiers" — tools that let Claude check its own work in real-time. Writing theoretical exploit code isn't enough. You need to test it, see if it crashes the browser, analyze memory dumps for anomalies.

Claude operated like an experienced penetration tester. It wrote test cases for suspicious functions, executed the code, parsed crash logs for memory corruption signs. When something looked wrong, it dove deeper into the source code for confirmation.
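One small piece of that loop, parsing a crash log for signs of memory corruption, might look roughly like the sketch below. This is an assumption about how such a check could be written, not Anthropic's actual tooling. It relies on the header line that AddressSanitizer prints for each error; the `classify_crash` function, the `MEMORY_CORRUPTION` set, and the sample log are hypothetical.

```python
import re

# AddressSanitizer reports begin with a header line such as:
#   ==4242==ERROR: AddressSanitizer: heap-use-after-free on address ...
ASAN_ERROR = re.compile(r"ERROR: AddressSanitizer: ([\w-]+)")

# Bug classes this sketch treats as memory corruption (an assumption,
# not an exhaustive list).
MEMORY_CORRUPTION = {
    "heap-use-after-free",
    "heap-buffer-overflow",
    "stack-buffer-overflow",
    "global-buffer-overflow",
    "stack-use-after-return",
}


def classify_crash(log_text):
    """Return (bug_class, is_memory_corruption), or (None, False) if no report."""
    match = ASAN_ERROR.search(log_text)
    if match is None:
        return None, False
    bug_class = match.group(1)
    return bug_class, bug_class in MEMORY_CORRUPTION


# Hypothetical log excerpt, shaped like a real AddressSanitizer header.
sample_log = (
    "==4242==ERROR: AddressSanitizer: heap-use-after-free on address "
    "0x602000000010 at pc 0x000000401234 bp 0x7ffd0000 sp 0x7ffd0008\n"
    "READ of size 8 at 0x602000000010 thread T0\n"
)

bug_class, is_corruption = classify_crash(sample_log)
```

A verifier built this way gives the model a yes-or-no signal it can act on: a test case that produces a memory-corruption report is worth a closer look at the source, while one that produces no report can be set aside.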

The Numbers Tell the Story

Mozilla confirmed the findings immediately. Of the 22 vulnerabilities, 14 were rated high severity, 7 medium, and 1 low. Most were patched in Firefox 148.0, released last month. That the 14 high-severity bugs alone amount to nearly 20% of the high-severity Firefox vulnerabilities patched in 2025 shows how effective the approach is.
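As a quick sanity check on the severity breakdown, the counts quoted above can be turned into shares of the total. The variable names here are just for illustration.

```python
# Severity counts as reported in the article.
severity_counts = {"high": 14, "medium": 7, "low": 1}

total = sum(severity_counts.values())

# Share of each severity level, as a percentage rounded to one decimal.
shares = {
    level: round(100 * count / total, 1)
    for level, count in severity_counts.items()
}
```

The counts add up to the 22 reported vulnerabilities, and high-severity findings make up roughly 64% of them.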

Use After Free: a critical JavaScript engine vulnerability allowing memory information disclosure.

WebAssembly JIT: just-in-time compilation errors leading to miscompilation vulnerabilities.

🛡️ From Discovery to Exploitation: AI's Current Limits

Here's where the story gets interesting. Anthropic spent roughly $4,000 in API credits testing something crucial: could Claude convert its vulnerability discoveries into working exploits?

The answer offers hope for defenders. Out of hundreds of attempts, the AI successfully created exploits in only two cases — and that was in a clean lab environment, without the security features that protect real Firefox installations.

Why This Matters

The cost of finding vulnerabilities proved orders of magnitude cheaper than weaponizing them. Defenders have an advantage — for now. AI can spot problems quickly, but struggles to turn them into actual weapons.

However, Anthropic warns this gap won't last forever. "Looking at the rate of progress, it is unlikely that the gap between frontier models' vulnerability discovery and exploitation abilities will last very long."

🚀 Claude Code Security: Automated Patching on the Horizon

The company didn't stop at discovery. Claude Code Security, currently in limited preview, aims to automate the patching process. The concept sounds simple — an AI agent that finds bugs and writes code to fix them. The execution is far more complex.

Early tests show promising results. The system verifies two critical elements: that the vulnerability is actually eliminated and that program functionality remains intact. It's not perfect — Anthropic recommends the same scrutiny you'd apply to any external contributor's patch — but the speed is remarkable.
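The two checks described above can be pictured as a simple gate that a patch must pass on both counts. The sketch below is a guess at the general shape, not the design of Claude Code Security: `PatchVerdict`, `verify_patch`, and the callables passed in are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PatchVerdict:
    vulnerability_fixed: bool
    functionality_intact: bool

    @property
    def accepted(self) -> bool:
        # A patch passes only if BOTH conditions hold.
        return self.vulnerability_fixed and self.functionality_intact


def verify_patch(
    reproducer: Callable[[], bool],
    regression_tests: List[Callable[[], bool]],
) -> PatchVerdict:
    """Check a patched build.

    reproducer: returns True if the original bug still triggers.
    regression_tests: each returns True when the test passes.
    """
    still_vulnerable = reproducer()
    all_tests_pass = all(test() for test in regression_tests)
    return PatchVerdict(
        vulnerability_fixed=not still_vulnerable,
        functionality_intact=all_tests_pass,
    )


# Hypothetical outcomes: the first patch fixes the bug and keeps tests
# green; the second fixes the bug but breaks an existing test.
good = verify_patch(lambda: False, [lambda: True, lambda: True])
regressed = verify_patch(lambda: False, [lambda: True, lambda: False])
```

Keeping the two results separate, rather than returning a single boolean, makes it clear to a human reviewer why a patch was rejected.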

Practical Tip: If you manage open-source projects or enterprise software, Mozilla recommends three key criteria for trusting AI-powered security reports: reproducible bugs, adequate problem documentation, and proposed solutions with clear impact explanations.
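A maintainer could encode those three criteria as a simple intake checklist. The following is one possible sketch. The `SecurityReport` fields and the `missing_criteria` helper are hypothetical names, and real triage would involve human judgment beyond these checks.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SecurityReport:
    reproducible: bool          # did the bug reproduce on a clean build?
    documentation: str          # description of the problem
    proposed_fix: str           # candidate patch or remediation
    impact_explanation: str     # what the fix changes and why it matters


def missing_criteria(report: SecurityReport) -> List[str]:
    """List which of the three trust criteria a report fails to meet."""
    missing = []
    if not report.reproducible:
        missing.append("reproducible bug")
    if not report.documentation.strip():
        missing.append("problem documentation")
    if not (report.proposed_fix.strip() and report.impact_explanation.strip()):
        missing.append("proposed solution with impact explanation")
    return missing


# Hypothetical report that reproduces and is documented but has no fix.
incomplete = SecurityReport(
    reproducible=True,
    documentation="Crash when loading a crafted module.",
    proposed_fix="",
    impact_explanation="",
)
gaps = missing_criteria(incomplete)
```

An empty list means the report clears all three bars. Anything else tells the reporter exactly what to add before the maintainers spend time on it.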

🌐 The Bigger Picture: Linux Kernel and Beyond

Firefox wasn't the only target. Anthropic reports that Claude Opus 4.6 has identified vulnerabilities in the Linux kernel as well — raising questions about the security of every major open-source project. If an AI can scan thousands of code files in days instead of months, traditional cybersecurity approaches need rethinking.

What makes the Anthropic-Mozilla collaboration special is its transparency. Mozilla published detailed triage information, helping other teams adapt their own practices. The era of "security through obscurity" is ending — we need collective defense.

⚠️ The Challenges Ahead

Despite the excitement, serious questions remain. What happens when AI models become as skilled at creating exploits as they are at finding vulnerabilities? Anthropic has internal safeguards, but what about other companies that might be less careful?

There's also the access issue. Right now, Claude Code Security is available in limited preview. Only major players have access to such tools, creating a new kind of "security divide." Smaller companies and indie projects might be left exposed.

Meanwhile, the speed of vulnerability discovery will create new pressures on development teams. If an AI can find 22 bugs in two weeks, how fast must developers move on patches? Mozilla handled Firefox 148 well, but not all teams have the same resources.

2026 started with a clear promise: AI tools will become integral to cybersecurity. The question isn't whether this will happen, but how quickly we'll adapt. Companies that embrace this change first — with proper safeguards and Mozilla-level transparency — will have significant advantages. The rest risk finding themselves playing a game where the rules change faster than they can follow.

