Anthropic on May 22 published the first quantitative progress report for Project Glasswing, its defensive AI initiative, showing that the unreleased Claude Mythos Preview model—together with roughly 50 partner organizations—had identified more than 10,000 high- or critical-severity vulnerabilities across systemically important software. The figure immediately captured attention but spawned inaccurate claims that a single model had found 10,000 critical bugs in open source. The real story is more revealing: the bottleneck has shifted from a scarcity of findings to a capacity crunch in verifying, responsibly disclosing, and patching them.
The update supplies the most concrete public evidence to date that frontier AI can accelerate vulnerability discovery to a scale the existing patch-and-disclose infrastructure cannot match. For operators, the implication is stark: flaws in the software that underpins critical systems are being exposed faster than they can be fixed.
Anthropic launched Project Glasswing on April 7, giving 11 named launch partners—including AWS, Apple, Google, JPMorganChase, Microsoft, and the Linux Foundation—and more than 40 other organizations access to Mythos Preview, together with $100 million in usage credits and $4 million in donations. The May 22 update, which also launched a coordinated vulnerability disclosure dashboard, reported that across all partner engagements the model surfaced more than 10,000 high- or critical-severity issues—a mix of internal first-party code, partner infrastructure, and open-source projects.
Separate from that aggregate, Anthropic's own open-source scanning effort examined more than 1,000 projects and flagged 23,019 total findings, of which 6,202 were rated high or critical by the model. Of 1,752 high- and critical-severity findings reviewed by humans, 1,587 were true positives, and 1,094 were confirmed as genuinely high or critical. The company projected about 3,900 confirmed high- or critical-severity open-source vulnerabilities if validation rates held, which is consistent with applying the 1,094-of-1,752 confirmation rate to the 6,202 model-rated findings. Yet as of the update, only 530 high- or critical-severity open-source vulnerabilities had been disclosed to maintainers; 75 had been patched and 65 had public advisory records out of 1,596 total disclosures across all severities.
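The projection can be checked with a few lines of arithmetic. The sketch below is a reader-side reconstruction, not Anthropic's published method: it assumes the projection simply scales the 6,202 model-rated findings by the share of human-reviewed findings confirmed as high or critical. The class and function names are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanSample:
    """Figures reported in the May 22 update (open-source scan only)."""
    flagged_high_critical: int    # rated high or critical by the model
    human_reviewed: int           # high/critical findings reviewed by humans
    true_positives: int           # reviewed findings that were real bugs
    confirmed_high_critical: int  # reviewed findings confirmed high/critical


def true_positive_rate(s: ScanSample) -> float:
    # Share of human-reviewed findings that were real vulnerabilities.
    return s.true_positives / s.human_reviewed


def confirmation_rate(s: ScanSample) -> float:
    # Share of human-reviewed findings confirmed as genuinely high/critical.
    return s.confirmed_high_critical / s.human_reviewed


def projected_confirmed(s: ScanSample) -> float:
    # Assumption: the reviewed sample is representative of all flagged findings.
    return s.flagged_high_critical * confirmation_rate(s)


sample = ScanSample(
    flagged_high_critical=6202,
    human_reviewed=1752,
    true_positives=1587,
    confirmed_high_critical=1094,
)

projection = projected_confirmed(sample)
disclosed_high_critical = 530  # disclosed to maintainers as of the update

print(f"true-positive rate:  {true_positive_rate(sample):.1%}")
print(f"confirmation rate:   {confirmation_rate(sample):.1%}")
print(f"projected confirmed: {projection:.0f}")
print(f"disclosed share:     {disclosed_high_critical / projection:.1%}")
```

Under that assumption the projection comes to roughly 3,870, which rounds to the reported figure of about 3,900, and the 530 disclosed vulnerabilities amount to roughly one in seven of the projected total.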
The most concrete independent confirmations came from Mozilla and wolfSSL. Mozilla incorporated fixes for 271 Mythos-identified vulnerabilities into Firefox 150: 180 rated high, 80 moderate, and 11 low. wolfSSL credited the model with findings that led to eight CVEs and prompted the wolfSSL 5.9.1 release; Anthropic highlighted a certificate-forgery issue (CVE-2026-5194). Cloudflare, which used Mythos Preview on more than 50 repositories, described the model's reasoning and proof generation as a genuine step forward. Anthropic attributed 2,000 total bugs and 400 high- or critical-severity findings to Cloudflare's engagement; Cloudflare's own public blog did not independently repeat those numbers.
Anthropic's public stance is unequivocal: "The limiting factor is no longer finding vulnerabilities; it is verifying, responsibly disclosing, and patching them." Microsoft, Oracle, and Palo Alto Networks have all acknowledged that AI-assisted discovery is swelling identification and patch volumes. That mismatch between discovery and remediation capacity is now the defining challenge.
Claude Mythos Preview remains a gated research preview available only to Glasswing participants. It is not publicly released, and its technical capabilities are still being assessed. An independent evaluation by XBOW noted strengths in reasoning and proof-of-concept generation but also flagged limitations requiring tooling to reach fully automated offensive parity. The UK AI Security Institute is separately examining the pace of autonomous cyber capability.
What remains unclear is how many of the 10,000-plus partner-wide vulnerabilities have been independently validated or patched; the aggregate number mixes codebases and verification standards with no public breakdown. Overlaps between partner findings and the open-source dashboard are unknown, and the false-positive rate across all partner engagements is not disclosed. Maintainer reactions—whether they are overwhelmed or questioning quality—have not been systematically surveyed, and regulatory responses are only beginning.
The Glasswing update makes clear that the security community now operates in a regime where AI can find flaws faster than organizations can fix them. That shift will force enterprises and policymakers to decide how much transparency the ecosystem can absorb and whether patch-and-disclose processes must be redesigned for a world of abundant findings.
