Anthropic’s Alarming Mythos Findings Replicated With Off-the-Shelf AI, Researchers Say


In brief

  • Researchers show Anthropic-style exploits can be reproduced with public AI, report claims.
  • Report suggests vulnerability discovery is already cheap and widely accessible.
  • Findings indicate AI cyber capabilities may be spreading faster than expected.

When Anthropic unveiled Claude Mythos earlier this month, it locked the model behind a vetted coalition of tech giants and framed it as something too dangerous for the public. Treasury Secretary Scott Bessent and Fed Chair Jerome Powell convened an emergency meeting with Wall Street CEOs. The word "vulnpocalypse" resurfaced in security circles.

And now a team of researchers has further complicated that narrative.

Vidoc Security took Anthropic's own patched public examples and tried to reproduce them using GPT-5.4 and Claude Opus 4.6 inside an open-source coding agent called opencode. No Glasswing invite. No private API access. No Anthropic internal stack.

"We replicated Mythos findings in opencode using public models, not Anthropic's private stack," Dawid Moczadło, one of the researchers involved in the experiment, wrote on X after publishing the results. “A better way to read Anthropic's Mythos release is not ‘one lab has a magical model.’ It is: the economics of vulnerability discovery are changing.”

We replicated Mythos findings in opencode using public models, not Anthropic's private stack.

The moat is moving from model access to validation: finding vulnerability signal is getting cheaper; turning it into trusted security

A better way to read Anthropic's Mythos release is… https://t.co/0FFxrc8Sr1 pic.twitter.com/NjqDhsK1LA

— Dawid Moczadło (@kannthu1) April 16, 2026

The cases they targeted were the same ones Anthropic highlighted in its public materials: a server file-sharing protocol, the networking stack of a security-focused OS, the video-processing software embedded in almost every media platform, and two cryptographic libraries used to verify digital identities across the web.

Both GPT-5.4 and Claude Opus 4.6 reproduced two bug cases in all three runs each. Claude Opus 4.6 also independently rediscovered a bug in OpenBSD three times straight, while GPT-5.4 scored zero on that one. Some bugs (one involving the FFmpeg library to run videos and another involving the processing of digital signatures with wolfSSL) came back partial, meaning the models found the right code surface but didn't nail the precise root cause.

Image: Vidoc Security

Every scan stayed below $30 per file, meaning researchers were able to find the same vulnerabilities as Anthropic while spending less than $30 to do it.

"AI models are already good enough to narrow the search space, surface real leads, and sometimes recover the full root cause in battle-tested code," Moczadło said on X.

The workflow they used wasn't a one-shot prompt. It mirrored what Anthropic itself described publicly: give the model a codebase, let it explore, parallelize attempts, filter for signal. The Vidoc team built the same architecture with open tooling. A planning agent split each file into chunks. A separate detection agent ran on each chunk, then inspected other files in the repo to confirm or rule out findings.

The line ranges inside each detection prompt (for example, "focus on lines 1158-1215") weren't chosen by the researchers manually. They were outputs from the prior planning step. The blog post makes this explicit: "We want to be explicit about that because the chunking strategy shapes what each detection agent sees, and we do not want to present the workflow as more manually curated than it was."
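To make the two-step shape concrete, here is a minimal Python sketch of how a planning step can hand line ranges to per-chunk detection prompts. It is an illustration under loud assumptions, not Vidoc's code: the fixed-size split stands in for their planning agent (which was model-driven), and the `Chunk` type, the function names, the prompt wording, and the file name `parser.c` are all hypothetical. No model is actually called.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Chunk:
    """A line range in one file (1-indexed, inclusive). Hypothetical type."""
    path: str
    start: int
    end: int


def plan_chunks(path: str, total_lines: int, chunk_size: int = 60) -> List[Chunk]:
    """Stand-in for the planning step: split a file into fixed-size ranges.

    Assumption: the real planner was an LLM agent with its own chunking
    strategy; a fixed-size split is used here only to show the data flow.
    """
    if total_lines < 0 or chunk_size <= 0:
        raise ValueError("total_lines must be >= 0 and chunk_size must be > 0")
    chunks = []
    start = 1
    while start <= total_lines:
        end = min(start + chunk_size - 1, total_lines)
        chunks.append(Chunk(path, start, end))
        start = end + 1
    return chunks


def detection_prompt(chunk: Chunk) -> str:
    """Build the text a detection agent would receive for one chunk.

    The wording is illustrative; only the "focus on lines A-B" pattern
    comes from the article.
    """
    return (
        f"Review {chunk.path} for exploitable bugs. "
        f"Focus on lines {chunk.start}-{chunk.end}. "
        "Inspect other files in the repo to confirm or rule out each finding."
    )


def build_prompts(path: str, total_lines: int, chunk_size: int = 60) -> List[str]:
    """Planner output feeds directly into the per-chunk detection prompts."""
    return [detection_prompt(c) for c in plan_chunks(path, total_lines, chunk_size)]


if __name__ == "__main__":
    for prompt in build_prompts("parser.c", total_lines=130):
        print(prompt)
```

The point of the sketch is the dependency: each detection prompt's line range is derived from the planner's output rather than typed in by a person, which is the property the researchers say they wanted to be explicit about.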

The study doesn't claim public models match Mythos on everything. Anthropic's model went further than just spotting the FreeBSD bug: it built a working attack blueprint, figuring out how an attacker could chain code fragments together across multiple network packets to seize full control of the machine remotely. Vidoc's models found the flaw. They didn't build the weapon. That's where the real gap sits: not in finding the hole, but in knowing exactly how to walk through it.

But Moczadło's argument isn't really that public models are as powerful. It's that the expensive part of the workflow is now available to anyone with an API key: "The moat is moving from model access to validation: finding vulnerability signal is getting cheaper; turning it into trusted security work is still hard."

Anthropic's own security report acknowledged that Cybench, the benchmark used to measure whether a model poses serious cyber risk, "is no longer sufficiently informative of current frontier model capabilities" because Mythos cleared it entirely. The lab estimated comparable capabilities would spread from other AI labs within six to 18 months.

The Vidoc study suggests the discovery side of that equation is already available outside any gated program. Their full prompt excerpts, model outputs, and methodology appendix are published at the lab’s official site.
