Xiaomi's New MiMo 2.5 Pro AI Can See, Hear, and Act—All in One Model

In brief

Xiaomi unveiled MiMo-V2.5 and V2.5-Pro, combining text, image, audio, and video capabilities into a single multimodal AI model.
The Pro version rivals top frontier models in coding and agentic benchmarks, while significantly improving token efficiency and cost.
The new models mark Xiaomi's accelerated AI push, with open-source plans and aggressive iteration following strong adoption on platforms like OpenRouter.

Xiaomi just launched a new AI model family. Again.

A few weeks ago, the company dropped MiMo-V2-Pro, a trillion-parameter model that had been quietly circulating on OpenRouter under the alias "Hunter Alpha" before Xiaomi revealed its identity. It went from anonymous to top-tier overnight. We tested it, and it was impressive.

Now Xiaomi is back with MiMo-V2.5 and MiMo-V2.5-Pro, a two-model family that adds something the previous generation never had in a single package: eyes, ears, and the ability to process video. Oh, and the company plans to open source the models in the near future.

The V2-Pro was text-and-code only. Multimodal capability existed in its sibling model, MiMo-V2-Omni, but that was a separate release with lower benchmark scores. MiMo-V2.5 collapses all of that into one model: faster, more capable, and with native image, video, and audio understanding baked in from the start.

That matters more than it might sound for regular users. For example, you can now upload a photo of your fridge and ask it to suggest meal recipes. Drop in a video tutorial and get a step-by-step summary. Record a meeting and have it pull out action items. All in one place, without juggling separate tools and separate models with different pricing strategies.

Xiaomi claims MiMo-V2.5-Pro represents "a major leap from MiMo-V2-Pro in general agentic capabilities, complex software engineering, and long-horizon tasks," and says it now matches frontier models like Claude Opus 4.6 and GPT-5.4 across most coding and agent benchmarks. The numbers mostly back that up, with some gaps still visible on harder reasoning tasks.

The base and Pro models serve different purposes. MiMo-V2.5-Pro is the heavy lifter. Xiaomi says it can "autonomously complete professional tasks involving 1,000+ tool calls, work that would take human experts days." That's for developers running complex, multi-step automated workflows. It runs at 60–80 tokens per second and costs $1.00 input / $3.00 output per million tokens.

MiMo-V2.5 is the everyday version. It is faster (100–150 tokens per second), cheaper ($0.40 input / $2.00 output), and supports all modalities, including the image, audio, and video input that the Pro tier skips. Both models carry a 1M-token context window, meaning they can hold about 750,000 words in a single conversation.
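
To make the price gap concrete, here is a minimal sketch that estimates the cost of one request under the prices quoted above. It assumes the MiMo-V2.5 prices are also per million tokens, the workload size is hypothetical, and the names `PRICES` and `request_cost` are illustrative only, not part of any Xiaomi SDK.

```python
# Prices in USD per million tokens, as quoted in this article.
# Assumption: the MiMo-V2.5 figures use the same per-million unit as the Pro figures.
PRICES = {
    "MiMo-V2.5": {"input": 0.40, "output": 2.00},
    "MiMo-V2.5-Pro": {"input": 1.00, "output": 3.00},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a single request."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: a 200,000-token document and a 20,000-token answer.
    for model in PRICES:
        print(f"{model}: ${request_cost(model, 200_000, 20_000):.2f}")
```

For that hypothetical request, the estimate comes to about $0.12 on MiMo-V2.5 and $0.26 on MiMo-V2.5-Pro, so the Pro tier costs a little more than twice as much for the same token counts.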

On SWE-bench Pro, a coding benchmark where models fix real bugs in real startup codebases, scored as a pass rate out of 100, MiMo-V2.5-Pro resolves 57.2% of tasks. That's near the top of the field; the average model manages about 25%. The story is similar on τ3-bench and ClawEval, where it lands within a few points of Claude Opus 4.6 and GPT-5.4. The gap opens up on Humanity's Last Exam, a gauntlet of graduate-level problems across dozens of academic fields: MiMo scores 48.0% versus GPT-5.4's 58.7%, a roughly 10-point deficit that's hard to paper over.

Where it really stands out is token efficiency. Xiaomi says MiMo-V2.5-Pro uses 42% fewer tokens than Kimi K2.6 at equivalent benchmark scores, and MiMo-V2.5 uses about half the tokens of Muse Spark for similar results. For anyone running these at scale, such as developers processing thousands of requests daily, that difference is real money.
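
As a rough illustration of why token efficiency turns into money, the sketch below applies the 42% figure Xiaomi cites to a hypothetical monthly workload billed at the Pro model's $3.00 output price. The baseline token count is invented for the example, and the calculation assumes both workloads are billed at the same per-token price, which will not hold when comparing different vendors.

```python
def output_cost(tokens: int, usd_per_million: float) -> float:
    """Return the USD cost of generating `tokens` output tokens."""
    return tokens * usd_per_million / 1_000_000


def tokens_after_reduction(baseline_tokens: int, reduction: float) -> int:
    """Return the token count after cutting usage by `reduction` (0.42 means 42%)."""
    return round(baseline_tokens * (1 - reduction))


if __name__ == "__main__":
    baseline = 100_000_000  # hypothetical monthly output tokens
    price = 3.00            # USD per million output tokens (MiMo-V2.5-Pro, per the article)
    reduced = tokens_after_reduction(baseline, 0.42)
    print(f"baseline: ${output_cost(baseline, price):.2f}")
    print(f"with 42% fewer tokens: ${output_cost(reduced, price):.2f}")
```

Under those assumptions, a workload that would have produced 100 million output tokens drops to 58 million, and the bill falls from $300 to $174.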

On multimodal tasks, MiMo-V2.5's scores put it on par with GPT-5.4 and Gemini 3.1 Pro, and quite close to Opus 4.6.

Since December 2025, Xiaomi has completed three major model releases: first its efficient MiMo-V2-Flash, then the V2-Pro/Omni/TTS trio in March, and now the V2.5 series. The company committed at least $8.7 billion in AI investment over the next three years, announced by CEO Lei Jun the day after V2-Pro launched, and the release cadence suggests that the budget is already moving.

Context also helps explain the speed. According to Digital Applied, as of early April, Xiaomi's models accounted for about 21% of all traffic on OpenRouter, growing over 42% in the past seven days. When your previous model becomes one of the most competitive models on the world's largest AI routing platform, you have both the resources and the pressure to iterate fast.

This was most likely due to the boom of the agentic AI tool Hermes and its agreement with Xiaomi, which gave users free access to MiMo-V2-Pro for a limited time. That window is already closed, but the hype was enough to put Xiaomi in the game.

Those who want to use Hermes for free can now test the new Step 3.5 Flash with the Nous API, or use OpenRouter with free models but more limited usage.

Token plan pricing also got a refresh. MiMo-V2.5 runs at a 1x credit rate; MiMo-V2.5-Pro at 2x. Xiaomi is no longer charging an extra multiplier for using the full 1 million-token context window, which makes long-document analysis noticeably cheaper. Existing users also get a full credit reset as a launch bonus.

Xiaomi says the model is available in its AI Studio. We tried to access it there immediately after launch and had no luck. It is, however, already live via the Xiaomi MiMo API, which is where most developers will actually use it.

The company says it's already training the next generation, with "deeper reasoning, tighter tool integration, and richer real-world grounding." At the rate Xiaomi is moving, that announcement is probably closer than you'd expect.