Mistral AI Drops New Open-Source Model. The Internet Is Not Impressed, Except for One Thing

In brief

Mistral Medium 3.5 is a 128 billion parameter dense model priced at $1.50 input / $7.50 output per million tokens, far above comparable Chinese alternatives.
Chinese open-source models (Qwen, GLM, MiMo-V2) dominate the top of the leaderboard, leaving Mistral as a lonely Western holdout.
Mistral is positioning the release as a building block toward a future large flagship model.

Mistral AI dropped Mistral Medium 3.5 on April 29. The Paris-based lab announced a dense 128-billion-parameter model and a set of agentic features, and walked straight into a wall of online “meh” reactions.

The release came in three parts. First, the model itself. Second, remote coding agents via Mistral Vibe CLI: cloud-based coding sessions that can push pull requests to GitHub and run in parallel without you sitting at a terminal. Third, Work Mode in Le Chat, Mistral's ChatGPT-style user interface, which now handles multi-step autonomous tasks like email triage, research synthesis, and cross-tool workflows.

Big ambitions, but a messy benchmark reality.

Medium 3.5 scores 77.6% on SWE-Bench Verified, a coding benchmark that tests whether a model can fix real GitHub issues by generating working patches. It also hits 91.4% on τ³-Telecom, which measures agentic tool use in specialized environments. Mistral also merged three previously separate models (Medium 3.1, Magistral, and Devstral 2) into one set of weights with configurable reasoning effort per request.

A unified model replacing three is a real engineering win. The problem is what it costs and who it's up against.

Mistral charges $1.50 per million input tokens and $7.50 per million output tokens. Alibaba's Qwen 3.6, at 27 billion parameters (less than a quarter of Medium 3.5's parameter count), scores 72.4% on the same SWE-Bench Verified benchmark and ships under Apache 2.0, meaning you can download and run it for free.
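To make the pricing concrete, here is a minimal Python sketch of what a workload might cost at the quoted rates. The two prices come from the article. The workload itself (10,000 requests of 4,000 input and 1,000 output tokens each) and the helper name `request_cost` are hypothetical, chosen only for illustration.

```python
# Prices quoted in the article, in USD per million tokens.
INPUT_PRICE_PER_M = 1.50
OUTPUT_PRICE_PER_M = 7.50


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the quoted per-million-token prices."""
    total_micro_usd = (
        input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    )
    return total_micro_usd / 1_000_000


# Hypothetical workload (an assumption, not from the article):
# 10,000 requests, each with 4,000 input tokens and 1,000 output tokens.
per_request = request_cost(4_000, 1_000)
total = 10_000 * per_request
print(f"per request: ${per_request:.4f}, total: ${total:.2f}")
# prints "per request: $0.0135, total: $135.00"
```

Because output tokens cost five times as much as input tokens, output-heavy agentic workloads would be billed noticeably more than this input-heavy example suggests.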

Did you know?

Parameters are what determine an AI's capacity to learn, reason, and store information. The more parameters, the wider the model's breadth of knowledge.

Scroll through the open-source leaderboards and the picture is stark. The top spots belong to Alibaba’s Qwen, GLM from China's Zhipu AI, and MiMo-V2 from Xiaomi, each of them cheaper, more powerful and more competitive than Mistral’s new release. Medium 3.5 hasn't even ranked on major independent leaderboards yet, and third-party evaluations are still pending.

The only good thing though, as some argue, is that Mistral is, at this point, the only non-Chinese model with any serious presence in the open-source conversation.

I think Mistral has the 10th highest valuation in the entire AI space (something like that).

All while they consistently release some of the worst models.

They have survived through European bureaucracy, lobbying and politics.

All because they’ve convinced demented bureaucrat… https://t.co/kh7ASvdi7C

— Youssof Altoukhi (@Youssofal_) April 29, 2026

The Internet reacts

Pedro Domingos, a machine learning professor at the University of Washington, wasn't gentle:

"Regular AI companies brag about how much better their model is on benchmarks. Only Mistral brags about how much worse its one is."

— Pedro Domingos (@pmddomingos) April 30, 2026

He followed up with a sharper question: "I don't know what's worse, for Europe to not be in the AI race or for it to be represented by a laughingstock like Mistral."

Youssof Altoukhi, founder of Yoyo Studios, did the math: Qwen 3.6, at 27 billion parameters, is 4.7 times smaller than Medium 3.5 and scores comparably on coding. Medium 3.5's output pricing puts it alongside closed models that score significantly higher on every major benchmark.
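The arithmetic behind that comparison can be checked in a few lines of Python. This is only a sketch using the figures quoted in the article (parameter counts and SWE-Bench Verified scores). The variable names are made up for illustration.

```python
# Figures quoted in the article.
MEDIUM_PARAMS_B = 128   # Mistral Medium 3.5, billions of parameters
QWEN_PARAMS_B = 27      # Qwen 3.6, billions of parameters
MEDIUM_SWE = 77.6       # SWE-Bench Verified score, percent
QWEN_SWE = 72.4         # SWE-Bench Verified score, percent

# How many times larger Medium 3.5 is than Qwen 3.6.
size_ratio = MEDIUM_PARAMS_B / QWEN_PARAMS_B

# Qwen 3.6's size as a fraction of Medium 3.5's.
size_fraction = QWEN_PARAMS_B / MEDIUM_PARAMS_B

# Gap between the two benchmark scores, in percentage points.
score_gap = MEDIUM_SWE - QWEN_SWE

print(f"size ratio: {size_ratio:.1f}x")          # prints "size ratio: 4.7x"
print(f"size fraction: {size_fraction:.2f}")     # prints "size fraction: 0.21"
print(f"score gap: {score_gap:.1f} points")      # prints "score gap: 5.2 points"
```

So the larger model buys roughly 5 percentage points on this one benchmark for about 4.7 times the parameters, which is the trade-off the critics are pointing at.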

“If it wasn’t for their political skill they would have been bankrupt by now,” he said.

Not everyone was purely dismissive. AI developer Michal Langmajer captured the ambivalence:

"I'm genuinely glad there's still a non-US, non-Chinese lab trying to build frontier LLMs but boy we have to level up the game in Europe. Their new flagship model is basically 'not the best' on any benchmark, yet costs multiple times more than most competitors."

— Michal Langmajer (@MichalLangmajer) April 30, 2026

Some developers argued open weights are a durability play, not a leaderboard play. A model anyone can download, fine-tune, and self-host doesn't need to win rankings today to stay relevant. Others pointed to Mistral's real enterprise deployments across Europe as evidence the moat isn't purely technical.

The geopolitical safety net

This is where Mistral's real pitch lives.

European enterprises under GDPR, banks handling sensitive customer data, and governments that won't route AI workloads through Chinese infrastructure have limited options. As Decrypt reported last December, HSBC signed a multi-year deal with Mistral specifically to self-host models on its own infrastructure. The appeal of an EU-headquartered open-weight lab with a $14 billion valuation doesn't show up in benchmark tables, but it shows up in procurement decisions.

Not the best at coding, and not the cheapest. But it is: not American, not Chinese, auditable, self-hostable, and legally safe for European enterprise.