Someone Built an Open-Source 'Theoretical Mythos' to Reverse-Engineer Anthropic's Most Dangerous AI

1 week ago 13

In brief

OpenMythos is simply a from-scratch reconstruction of the Claude Mythos architecture, built lone from nationalist probe papers and educated guesses.
Claude Mythos is Anthropic's astir almighty model, locked distant successful Project Glasswing due to the fact that it autonomously recovered 271 Firefox vulnerabilities and 32-step web attacks.
The repo is theoretical scaffolding—code without trained weights. It mirrors a abstracted effort by Vidoc Security that reproduced Mythos's vulnerability findings utilizing off-the-shelf models.

If Anthropic won't amusement you what's wrong its astir unsafe AI, idiosyncratic connected GitHub volition guess.

A developer named Kye Gomez has published OpenMythos, an open-source reconstruction of what helium thinks Claude Mythos looks similar nether the hood. The repo has picked up implicit 10,000 GitHub stars successful a fewer weeks upon release, and ships with an exhaustive “readme” record afloat of equations, citations, and a polite disclaimer that it has thing to bash with Anthropic.

It's speculation. But it's structured speculation, successful code.

Here’s a speedy refresher connected what Mythos is: Mythos leaked into nationalist presumption successful precocious March, erstwhile Anthropic accidentally published draught materials describing it arsenic the company's astir susceptible exemplary to date—a tier supra Opus. The follow-up, Mythos Preview, turned retired to beryllium unreleasably bully astatine cybersecurity.

Per Anthropic, Mythos recovered 271 vulnerabilities successful Firefox during Mozilla testing. It became the archetypal AI exemplary to implicit a 32-step firm web onslaught simulation. Anthropic locked it wrong Project Glasswing, a vetted conjugation of astir 40 partners, including Microsoft, Apple, Amazon, and the NSA.

The nationalist ne'er gets to interaction it. So Gomez tried to fig retired however it works.

OpenMythos's cardinal conjecture is that Mythos is simply a Recurrent-Depth Transformer—also called a looped transformer. Standard models stack hundreds of unsocial layers. Looped models instrumentality a smaller stack and tally it done itself galore times per guardant pass.

In different words, it’s the aforesaid weights going done much iterations. Deeper thinking, successful continuous latent space, earlier immoderate token gets emitted.

The repo argues this would explicate Mythos's 2 strangest qualities: It reasons done caller problems nary different exemplary tin crack, but its earthy memorization is uneven. That's the architectural fingerprint of looping—composition implicit storage.

OpenMythos cites Parcae, an April 2026 insubstantial from University of California San Diego and Together AI that solved the long-standing instability occupation successful looped models—a 770 million-parameter Parcae exemplary matches a 1.3 cardinal fixed-depth transformer connected quality, with predictable scaling laws for however galore loops to run. The repo besides borrows DeepSeek's Multi-Latent Attention to compress memory, and a Mixture-of-Experts setup to grip breadth crossed domains.

What it does not person is weights, truthful fundamentally it’s a method without an executor.

OpenMythos is theoretical. The codification defines exemplary variants from 1 cardinal to 1 trillion parameters, but you person to bid them yourself—the readme record points to a 3 cardinal parameter grooming publication connected FineWeb-Edu and a Chinchilla-adjusted 30 billion-token target, which is the benignant of compute measure that runs into hundreds of thousands of dollars connected H100s. Nobody's done it yet.

So wherefore does it matter?

Because it's the 2nd clip successful a period idiosyncratic has chipped astatine the partition astir Mythos. The archetypal was a survey from Vidoc Security, which reproduced respective of Mythos's astir alarming vulnerability findings utilizing GPT-5.4 and Claude Opus 4.6 wrong an open-source agent. No Glasswing access, and astatine nether $30 per scan. Different angle, aforesaid conclusion: The moat astir Mythos whitethorn beryllium thinner than the selling suggested.

OpenMythos and the Vidoc replication are doing antithetic jobs. Vidoc reproduced Mythos's outputs—the vulnerability discoveries themselves—using existing models. OpenMythos is trying to reproduce the architecture—the existent instrumentality that produces those outputs. One says you don't request Mythos to find the bugs Mythos found. The different says, eventually, you mightiness beryllium capable to physique thing similar Mythos yourself.

Anthropic astir surely doesn't stock Gomez's architectural guesses publicly, and respective of the plan choices successful OpenMythos are explicit hedges—the readme record makes definite to beryllium vague capable truthful users cognize this is conscionable an approach. It repeatedly says "likely," "suspected," and "almost certainly." Real Mythos whitethorn not beryllium a looped transformer astatine all. Or it mightiness beryllium 1 with details Gomez hasn't reverse-engineered yet.

What OpenMythos demonstrates is that the probe lit already contains astir of the pieces. Looped transformers, Mixture of Experts, Multi-Latent Attention, Adaptive Computation Time, Parcae's stableness fix—none of it is proprietary. The repo is, much than anything, an inventory of what's publically known astir however to physique a Mythos-class model.

The repo is licensed MIT, and it has 2,700 forks already. The grooming publication is sitting there, waiting for idiosyncratic with a GPU clump and a thesis to prove.