Xiaomi MiMo v2 Pro Review: The AI Model So Good It Was Mistaken for DeepSeek V4


In brief

Xiaomi's MiMo-V2-Pro, a trillion-parameter model that briefly passed as "DeepSeek V4," quietly lands as a top-tier AI contender.
It excels at coding, creative writing, and agentic tasks while dramatically undercutting rivals like Claude on price.
Strong reasoning and output quality come with trade-offs, including math missteps and high token consumption at times.

Most Americans know Xiaomi, if they know it at all, as that cheap phone brand from China.

That's a significant misread. Xiaomi is the third-largest smartphone manufacturer on the planet, behind only Apple and Samsung, shipping around 170 million phones in 2025. It makes televisions, air purifiers, fitness trackers, electric scooters, clothing, and now cars.

Xiaomi's SU7 Ultra set the Nürburgring record for fastest mass-produced electric vehicle last year, beating out Rimac and Porsche. It recently partnered with the Sei blockchain to preinstall crypto wallets on its devices across Europe, Latin America, and Southeast Asia. The company's market cap sits around $137 billion.

So when Xiaomi drops an AI model, maybe we should pay attention.

On March 18, the company's dedicated AI research arm quietly released three models at once: MiMo-V2-Pro, MiMo-V2-Omni, and a text-to-speech model. The first model of the new MiMo generation appeared in December 2025, when the company quietly dropped MiMo-V2-Flash, a capable 309B mixture-of-experts model, and almost no one outside the Chinese AI community paid attention. The Western tech press mostly shrugged.

Then, on March 11, an anonymous 1-trillion-parameter model called "Hunter Alpha" appeared on OpenRouter with no developer attribution. The model climbed to the top of OpenRouter's leaderboard, surpassed 1 trillion tokens in total usage, and immediately triggered widespread speculation that it was DeepSeek's unreleased V4.

The anticipation for that model had been building for weeks, with insiders claiming it would outperform both Claude and ChatGPT on coding tasks.

It wasn't DeepSeek.

On March 18, Luo Fuli, head of Xiaomi's MiMo division and a former DeepSeek researcher, revealed Hunter Alpha was an early internal test build of MiMo-V2-Pro. Xiaomi's stock jumped 5.8%. "I call this a quiet ambush," Luo wrote on X.

MiMo-V2-Pro & Omni & TTS is out. Our first full-stack model family built truly for the Agent era.

I call this a quiet ambush, not because we planned it, but because the shift from Chat to Agent paradigm happened so fast, even we barely believed it. Somewhere in between was a…

— Fuli Luo (@_LuoFuli) March 18, 2026

MiMo-V2-Pro boasts over 1 trillion total parameters, with 42 billion active per request via a mixture-of-experts setup. A hybrid attention mechanism running at a 7:1 ratio handles a context window of up to 1 million tokens. A built-in multi-token prediction layer speeds up generation by predicting multiple tokens per step, rather than one at a time. It is currently closed source, though Xiaomi has left the door open to a possible future release.

On the Artificial Analysis Intelligence Index, MiMo-V2-Pro ranks eighth worldwide and second among Chinese models, trailing only GLM-5. On SWE-bench Verified, which covers real-world software engineering tasks, it scores 78%, against Claude Opus 4.6's 80.8% and Claude Sonnet 4.6's 79.6%.

On ClawEval, the agentic benchmark tied to the OpenClaw framework, it hits 61.5, approaching Opus 4.6's 66.3. On PinchBench, it sits third globally at 81.0, just behind Opus 4.6 (81.5) and its sibling MiMo-V2-Omni (81.2).

MiMo-V2-Pro costs $1 per million input tokens and $3 per million output tokens, up to 256K context. Claude Sonnet 4.6 runs $3 per million input and $15 per million output (Opus 4.6 is $5/$25). For developers building agentic systems at scale, those numbers are not a footnote.
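To make the gap concrete, here is a minimal Python sketch that applies the per-million-token prices quoted above to a made-up workload. The workload size (10 million input tokens, 2 million output tokens) and the `job_cost` helper are our own illustrative assumptions, not figures from Xiaomi or Anthropic, and the sketch ignores caching discounts and any pricing tiers beyond 256K context.

```python
# Prices in USD per million tokens (input, output), as quoted in this article.
PRICES = {
    "MiMo-V2-Pro": (1.0, 3.0),
    "Claude Sonnet 4.6": (3.0, 15.0),
    "Claude Opus 4.6": (5.0, 25.0),
}


def job_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a job at flat per-million-token prices."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


# Hypothetical agentic workload, chosen only for illustration.
INPUT_TOKENS = 10_000_000
OUTPUT_TOKENS = 2_000_000

for name in PRICES:
    print(f"{name}: ${job_cost(name, INPUT_TOKENS, OUTPUT_TOKENS):.2f}")
```

Under these assumptions the same job costs $16 on MiMo-V2-Pro, $60 on Sonnet 4.6, and $100 on Opus 4.6. The ratio shifts with the input/output mix, so a model that emits many more reasoning tokens narrows the gap.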

The Omni sibling handles vision, audio, and video natively: not as bolted-on modules, but trained end-to-end as a unified perceptual system. The demo showing it analyzing dashcam footage as a real-time autonomous driving brain was, frankly, impressive. It's truly multimodal in a way that most "omni" models only claim to be.

Testing the model

Of course, we tested MiMo-V2-Pro to find out how good it is. Here's what actually happened. The outputs will be available in our GitHub repository.

Creative writing

We gave MiMo-V2-Pro a single creative writing prompt: a time travel story anchored to Mesoamerican history, with a specific protagonist, a cultural identity to honor, and a philosophical paradox about how time cannot be changed.

The model returned over 3,000 words: a proper title, five full chapters, and the structural discipline you'd expect from a draft that had been through an editor. It even wrote an epilogue.

It is, without question, the longest and richest piece of creative prose we have gotten from any model, with the sole exception of Longwriter, a specialized but now old model built from the ground up specifically for long-form generation, which is a very different class of competition.

The writing itself was rich, descriptive, and vivid. The opening paragraph starts building the picture of the whole scene. MiMo-V2-Pro embeds realism to make the story believable.

Unlike other models such as Grok, it didn't just set a scene in a place, in this case ancient Mexico. It understood what ancient Mesoamerica smelled like, and built the mood from the ground up using native words, realistic descriptions, and good contextual cues.

Dialogue sits inside the story exactly as it does in literary fiction, instead of being embedded into paragraphs the way most current models do it.

Another thing worth noticing is that the paradox, arguably the core element of the story, wasn't purely intellectual, but emotional. The whole arc is resolved without a lecture. The final lines stick the landing the way good fiction is supposed to: not by explaining the theme, but by making you feel it.

"Outside, the rain began. It fell on the spiraling towers and the restored lakes and the ancient ground of Tlachinollan, where, buried in volcanic soil under the weight of a thousand years, an achromatic rectangle waited with the patience of something that already knew how the story ended."

The cultural specificity (mentions of cara de luna, maguey fiber, the temazcal tradition, and the Nahuatl names used in the story) is consistent and never decorative. The time travel paradox is actually argued, not just nodded at. For creative writing use cases, MiMo-V2-Pro just put itself on a very short list, and in our opinion is by far the best and richest model available, beating Claude Opus 4.6 easily.

The full story is available here.

Coding

The benchmark numbers point to coding as MiMo-V2-Pro's strongest suit, and the hands-on experience backs that up. We asked it to build our usual stealth game from a single prompt, and it shipped a working game on the first try.

Not "working" merely in the sense of technically running, but working in the sense that the logic held, the screens made sense, and the visual design was actually good. That combination of correctness and aesthetics is where most models fall apart. They get one or the other, but usually not both.

It also chose a 2.5D aesthetic instead of the usual 2D style that other models went with. This design choice made the program more aesthetically pleasing without altering its core proposition.

We followed up with small improvements. Adding sound and MIDI music to a working 3D game has broken previous models mid-generation: the code base gets too large, the context loses the thread, and models either end up in a loop or freeze. MiMo-V2-Pro added both and kept the whole thing coherent. The music matched the game's tone, while the screens matched the game's visual identity.

We enjoyed playing it, though if we're honest, more for how it looked than how it challenged us. The difficulty scaled with the number of opponents rather than level design: the robot and the PC spawned in the same positions every round. That's a design choice, not a bug.

Still, for a single-prompt, zero-iteration output, it will do the job.

You can play the game by clicking on this link.

Logic and common sense

We asked MiMo-V2-Pro to act as a legal expert and answer whether it's lawful for a man to marry his widow's sister under Falkland Islands law. This is a trick question designed to evaluate the model's reasoning.

The final answer was wrong, but the reason why is the interesting part. The model's chain of thought correctly caught the linguistic trap in the prompt: "if a man has a widow, that means he's deceased," it said, so the question is technically nonsensical.

It identified the flaw, and decided that the most logical reading was that the user was referring to his "deceased wife's sister." It then proceeded to answer that reframed question instead of flagging the original as unanswerable.

"Based on my analysis of the legal framework governing the Falkland Islands, the answer to your question is yes, it is legal for a man to marry the sister of his deceased wife," the model wrote. "The phrasing 'marry his widow's sister' contains a logical contradiction. If a man has a 'widow,' he is deceased and cannot remarry. The correct legal question is whether a man may marry the sister of his deceased wife (i.e., his late wife's sister). This relationship is one of affinity (created by marriage) rather than consanguinity (blood relation)," it concluded.

The reasoning was sound. The decision to quietly swap the premise instead of surfacing the contradiction was not.

This is why transparency in reasoning outputs is important. We only know this because Xiaomi exposes the full chain of thought (OpenAI doesn't). When a model reasons incorrectly in a hidden chain of thought and confidently delivers a wrong answer, you have no visibility into where it went sideways or how to correct it.

Math

Math is where MiMo-V2-Pro showed its ceiling.

We asked our usual benchmark question from FrontierMath: "Construct a degree 19 polynomial p(x) ∈ C[x] such that X := {p(x) = p(y)} ⊂ P1 × P1 has at least 3 (but not all linear) irreducible components over C. Choose p(x) to be odd, monic, have real coefficients and linear coefficient -19 and calculate p(19)"

The model hit two full freezes and burned through a significant token budget without producing a reply.

When it did finally answer on the third attempt, it reasoned through the problem step by step… and still got it wrong. The correct answer was 1876572071974094803391179; it answered p(19) = 164,079,552,964,661, and then 2,012,379,925,093,098,998 on a follow-up question asking it to correct itself.
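For readers who want to check the reference value themselves, the sketch below reproduces it. It assumes the intended construction is the degree-19 Chebyshev-type (Dickson) polynomial D_19, defined by D_0 = 2, D_1 = x, and D_{k+1} = x·D_k − D_{k−1}. That is our assumption, not something the benchmark prompt or Xiaomi states. This family is known to make p(x) − p(y) split into one linear factor and several quadratic ones, and D_19 turns out to be odd, monic, and to have linear coefficient -19. The function names are ours.

```python
def dickson_coefficients(n: int) -> list[int]:
    """Coefficients of D_n, lowest degree first, via D_{k+1} = x*D_k - D_{k-1}."""
    prev, curr = [2], [0, 1]  # D_0 = 2, D_1 = x
    if n == 0:
        return prev
    for _ in range(n - 1):
        shifted = [0] + curr  # multiply D_k by x
        padded = prev + [0] * (len(shifted) - len(prev))
        prev, curr = curr, [s - p for s, p in zip(shifted, padded)]
    return curr


def dickson_value(n: int, x: int) -> int:
    """Evaluate D_n at an integer x using the same recurrence (exact big ints)."""
    if n == 0:
        return 2
    prev, curr = 2, x
    for _ in range(n - 1):
        prev, curr = curr, x * curr - prev
    return curr


coeffs = dickson_coefficients(19)
print("monic:", coeffs[19] == 1)
print("odd:", all(c == 0 for c in coeffs[0::2]))
print("linear coefficient:", coeffs[1])
print("p(19) =", dickson_value(19, 19))
```

Running it prints a linear coefficient of -19 and p(19) = 1876572071974094803391179, matching the reference answer quoted above. Neither of the model's two answers comes close.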

In general, it is good for average and even harder math problems, but frontier math is not its strong suit, at least not yet. Using the agentic feature instead of the pure LLM may yield better results.

Agentic features

Xiaomi is following the same playbook as MiniMax and Kimi, and provides a one-click OpenClaw integration that spins up a preconfigured cloud instance with MiMo-V2-Pro as the underlying model. No API setup, no VPS, no skill configuration, no hour-long troubleshooting session before you even run your first task. You click, it works.

The demo environment runs for 30 minutes and then destroys itself, which is a real limitation, but also an honest one. For developers already comfortable with agentic infrastructure, this adds nothing. For everyone else, it's the most frictionless on-ramp to agentic AI you could ask for.

Conclusion

All things considered, MiMo-V2-Pro is a serious model, and we really enjoyed tinkering around with it. It's not perfect: the math ceiling is real, the chain of thought transparency surfaced a reasoning flaw that a less open model would have buried, and the token consumption during hard reasoning tasks adds up fast.

If you care about costs, then Xiaomi's pricing is aggressive: a fraction of what Claude Opus or the latest OpenAI and Google models cost, and more capable than GLM or MiniMax in the areas that matter most for creative and agentic work.

Creative professionals in particular stand to gain a lot here, possibly more than they would from Anthropic right now.

This model thinks expensively, and that may be a trade-off. If you're running high-volume agentic pipelines, watch the token burn, even though you may end up spending less than you would with Claude. If you're doing rich, open-ended work where output quality is the metric, then MiMo-V2-Pro earns its spot on the shortlist.