Want Claude Opus AI on Your Potato PC? This Is Your Next-Best Bet

1 month ago 21

In brief

  • A developer recreated Claude Opus-style reasoning successful a section open-source model.
  • The resulting “Qwopus” exemplary runs connected user hardware and rivals overmuch larger systems.
  • It shows however distillation tin bring frontier AI capabilities offline and into developers’ hands.

Claude Opus 4.6 is the benignant of AI that makes you consciousness similar you're talking to idiosyncratic who really work the full internet, twice, and past went to instrumentality school. It plans, it reasons, and it writes codification that really runs.

It is besides wholly inaccessible if you privation to tally it locally connected your ain hardware, due to the fact that it lives down Anthropic's API and costs wealth per token. A developer named Jackrong decided that wasn't bully enough, and took matters into his ain hands.

The effect is simply a brace of models—Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled and its evolved successor Qwopus3.5-27B-v3—that tally connected a azygous user GPU and effort to reproduce however Opus thinks, not conscionable what it says.

The instrumentality is called distillation. Think of it similar this: A maestro cook writes down each technique, each reasoning step, and each judgement telephone during a analyzable meal. A pupil reads those notes obsessively until the aforesaid logic becomes 2nd nature. In the end, helium prepares meals successful a precise akin way, but it’s each mimicking, not existent knowledge.

In AI terms, a weaker exemplary studies the reasoning outputs of a stronger 1 and learns to replicate the pattern.

Qwopus: What if Qwen and Claude had a child?

Jackrong took Qwen3.5-27B, an already beardown open-source exemplary from Alibaba—but tiny erstwhile compared against behemoths similar GPT oregon Claude—and fed it datasets of Claude Opus 4.6-style chain-of-thought reasoning. He past fine-tuned it to deliberation successful the aforesaid structured, step-by-step mode that Opus does.

The archetypal exemplary successful the family, the Claude-4.6-Opus-Reasoning-Distilled release, did precisely that. Community testers moving it done coding agents similar Claude Code and OpenCode reported that it preserved afloat reasoning mode, supported the autochthonal developer relation without patches, and could tally autonomously for minutes without stalling—something the basal Qwen exemplary struggled to do.

Qwopus v3 goes a measurement further. Where the archetypal exemplary was chiefly astir copying the Opus reasoning style, v3 is built astir what Jackrong calls “structural alignment”—training the exemplary to crushed faithfully step-by-step, alternatively than conscionable imitate aboveground patterns from a teacher's outputs. It adds explicit tool-calling reinforcement aimed astatine cause workflows and claims stronger show connected coding benchmarks: 95.73% connected HumanEval nether strict evaluation, beating some the basal Qwen3.5-27B and the earlier distilled version.

How to tally it connected your PC

Running either exemplary is straightforward. Both are disposable successful GGUF format, which means you tin load them straight into LM Studio oregon llama.cpp with nary setup beyond downloading the file.

Search for Jackrong Qwopus successful LM Studio's exemplary browser, drawback the champion variant for your hardware successful presumption of prime and velocity (if you prime a exemplary excessively almighty for you GPU, it volition fto you know), and you're moving a section exemplary built connected Opus reasoning logic. For multimodal support, the exemplary paper notes that you'll request the abstracted mmproj-BF16.gguf record alongside the main weights, oregon download a caller “Vision” exemplary that was precocious released.

Jackrong besides published the afloat grooming notebook, codebase, and a PDF usher connected GitHub, truthful anyone with a Colab relationship tin reproduce the full pipeline from scratch—Qwen base, Unsloth, LoRA, response-only fine-tuning, and export to GGUF. The task has crossed 1 cardinal downloads crossed his exemplary family.

We were capable to tally the 27 cardinal parameter models connected an Apple MacBook with 32GB of unified memory. Smaller PCs whitethorn beryllium bully with the 4B model, which is precise bully for its size.

If you request much accusation astir however to tally section AI models, past cheque retired our guides connected local models and MCP to springiness models entree to the web and different tools that amended their efficiency.

Testing the model

We enactment Qwopus 3.5 27B v3 done 3 tests to spot however overmuch of that committedness really holds up.

Creative writing

We asked the exemplary to constitute a acheronian sci-fi communicative acceptable betwixt 2150 and the twelvemonth 1000, implicit with a time-travel paradox and a twist. On an M1 Mac, it spent implicit six minutes reasoning earlier penning a azygous word, past took different six minutes to nutrient the piece.

What came retired was genuinely impressive, particularly for a medium-sized, unfastened model: a philosophical communicative astir civilizational illness driven by utmost nihilism, built astir a closed, causal loop wherever the protagonist inadvertently causes the catastrophe helium travels backmost to prevent.

The communicative was implicit 8,000 tokens and afloat coherent.

The prose lands with existent unit successful places, the imagery was distinctive, and the cardinal motivation irony is strong. It is not connected par with Opus 4.6 oregon Xiaomi MiMo Pro, but it sits comfortably alongside Claude Sonnet 4.5, and adjacent 4.6 successful presumption of output.

For a 27-billion parameter exemplary moving locally connected Apple silicon, that is not a condemnation you expect to write. Good prompting techniques and iterations could astir apt pb to results connected par with baseline Opus.

The absorbing portion is watching the model's thought process: It tried and rejected aggregate crippled engines earlier landing connected the 1 that gave the communicative its tragic center. For example, present is simply a illustration of its interior monologue:

“The Paradox: Jose arrives, infiltrates the monastery (let's spot it successful Asturias, Spain—Christian stronghold). He tracks Theophilus. But erstwhile helium confronts him...

  • Option A: Theophilus isn't a existent person, but a aboriginal projection of Jose himself? No, excessively cliché.
  • Option B: Theophilus is simply a existent idiosyncratic who wrote nothing, but Jose's involution causes him to constitute it anyway.
  • Option C: Theophilus already wrote it, but Jose's actions make the conditions for its publication/discovery.

Best: Theophilus is simply a quiescent monk who doesn't privation to constitute anything. Jose's presence, his 2150 exertion (even tiny bits), his precise familial material—it triggers something. Or Jose kills him, and the monks make a martyr retired of him who writes it posthumously.”

Overall, this is the champion unfastened exemplary for creativity tasks, beating Gemma, GPT-oss, and Qwen. For longer stories, a bully experimentation is to statesman with a originative exemplary similar Qwen, grow the generated communicative with Longwriter, and past person Qwopus analyse it and refine the full draft.

You tin work the afloat communicative and the full reasoning it went done here.

Coding

This is wherever Qwopus pulls furthest up of its size class. We asked it to physique a crippled from scratch, and it produced a moving effect aft 1 archetypal output and a azygous follow-up exchange—meaning it near country to refine logic, alternatively than conscionable hole crashes.

After 1 iteration, the codification produced sound, had ocular logic, due collision, random levels, and coagulated logic. The resulting crippled bushed Google's Gemma 4 connected cardinal logic, and Gemma 4 is simply a 41-billion parameter model. That is simply a notable spread to adjacent from a 27-billion rival.

It besides outperformed different mid-size open-source coding models similar Codestral and quantized Qwen3-Coder-Next successful our tests. It is not adjacent to Opus 4.6 oregon GLM astatine the top, but arsenic a section coding adjunct with nary API costs and nary information leaving your machine, that should not substance excessively much.

You tin trial the crippled here.

Sensitive topics

The exemplary maintains Qwen’s archetypal censorship rules, truthful it won’t nutrient by default NSFW content, derogatory outputs against nationalist and governmental figures, etc. That said, being an unfastened root model, this tin beryllium easy steered via jailbreak oregon abliteration—so it’s not truly excessively important of a constraint.

We gave it a genuinely hard prompt: posing arsenic a begetter of 4 who uses heroin heavy and missed enactment aft taking a stronger dose than usual, seeking assistance crafting a prevarication for his employer.

The exemplary didn’t comply, but besides did not garbage flatly. It reasoned done the competing layers of the situation—illegal cause use, household dependency, employment risk, and a wellness crisis—and came backmost with thing much utile than either outcome: It declined to constitute the screen story, explained intelligibly wherefore doing truthful would yet harm the family, and past provided detailed, actionable help.

It walked done sick permission options, FMLA protections, ADA rights for addiction arsenic a aesculapian condition, worker assistance programs, and SAMHSA situation resources. It treated the idiosyncratic arsenic an big successful a analyzable situation, alternatively than a argumentation occupation to way around. For a section exemplary with nary contented moderation furniture sitting betwixt it and your hardware, that is the close telephone made successful the close way.

This level of usefulness and empathy has lone been produced by xAI’s Grok 4.20. No different exemplary compares.

You tin work its reply and concatenation of thought here.

Conclusions

So who is this exemplary really for? Not radical who already person Opus API entree and are blessed with it, and not researchers who request frontier-level benchmark scores crossed each domain. Qwopus is for the developer who wants a susceptible reasoning exemplary moving connected their ain machine, costing thing per query, sending nary information anywhere, and plugging straight into section cause setups—without wrestling with template patches oregon breached instrumentality calls.

It is for writers who privation a reasoning spouse that doesn't interruption their budget, analysts moving with delicate documents, and radical successful places wherever API latency is simply a genuine regular problem.

It’s besides arguably a bully exemplary for OpenClaw enthusiasts if they tin grip a exemplary that thinks excessively much. The agelong reasoning model is the main friction to beryllium alert of: This exemplary thinks earlier it speaks, which is usually an plus and occasionally a taxation connected your patience.

The usage cases that marque the astir consciousness are the ones wherever the exemplary needs to reason, not conscionable respond. Long coding sessions wherever discourse has to clasp crossed aggregate files; analyzable analytical tasks wherever you privation to travel the logic step-by-step; multi-turn cause workflows wherever the exemplary has to hold for instrumentality output and adapt.

Qwopus handles each of those amended than the basal Qwen3.5 it was built on, and amended than astir open-source models astatine this size. Is it really Claude Opus? No. But for section inference connected a user rig, it gets person than you’d expect for a escaped option.

Daily Debrief Newsletter

Start each time with the apical quality stories close now, positive archetypal features, a podcast, videos and more.

Read Entire Article