Joining the ranks of a growing number of smaller yet powerful reasoning models is MiroThinker 1.5 from MiroMind, which has only 30 billion parameters, compared with the hundreds of billions or trillions used by leading large language model (LLM) foundations.
But MiroThinker 1.5 stands out among these small reasoners for one major reason: it offers agentic search capabilities rivaling trillion-parameter competitors such as Kimi K2 and DeepSeek, at a fraction of the inference cost.
This release marks an important step in the search for effective and deployable AI agents. Businesses have long been forced to choose between costly API calls to pioneering models or compromised local performance. MiroThinker 1.5 offers a third way: open models designed specifically for extensive tool use and multi-step reasoning.
One of the biggest emerging trends in the industry is the move away from highly specialized agents toward more generalist ones. Until recently, this capability was largely limited to proprietary models. MiroThinker 1.5 represents a strong contender in this space. Watch my YouTube video below.
For IT teams evaluating AI deployment, hallucinations remain the biggest barrier to using open models in production. MiroThinker 1.5 addresses this problem through what MiroMind calls “scientific mode”: a fundamental architectural change in how the model handles uncertainty.
Rather than generating statistically plausible answers from memorized patterns (the root cause of most hallucinations), MiroThinker is trained to run a testable research loop: proposing hypotheses, querying external sources for evidence, identifying mismatches, revising conclusions, and checking again. During training, the model is explicitly penalized for high-confidence outputs that lack source support.
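The research loop described above can be sketched in a few lines. This is a hypothetical illustration, not MiroThinker's actual implementation: `search()` stands in for any external evidence source (web search, retrieval, etc.), and the revision step is deliberately naive.

```python
def search(query: str) -> list[str]:
    """Placeholder evidence source; a real agent would call a search tool."""
    knowledge = {
        "capital of australia": ["Canberra is the capital of Australia."],
    }
    return knowledge.get(query.lower(), [])

def verified_answer(question: str, hypothesis: str, max_rounds: int = 3) -> str:
    """Accept a hypothesis only if external evidence supports it;
    otherwise revise from the evidence instead of guessing."""
    for _ in range(max_rounds):
        evidence = search(question)
        if any(hypothesis.lower() in doc.lower() for doc in evidence):
            return hypothesis  # evidence-backed: return with sources
        if evidence:
            hypothesis = evidence[0].split()[0]  # crude revision from evidence
        else:
            return "uncertain: no supporting sources found"
    return hypothesis

print(verified_answer("capital of Australia", "Sydney"))  # revised to "Canberra"
```

The key behavior is in the final branch: when no evidence exists, the loop returns an explicit uncertainty marker rather than the original hypothesis, mirroring the training penalty for confident but unsupported outputs.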
The practical implication for enterprise deployment is auditability. When MiroThinker produces a response, it can show both the chain of reasoning and the external sources consulted. For regulated industries such as financial services, healthcare, and legal, this creates a documentation trail that memorization-based models cannot provide. Compliance teams can review not only what the model concluded, but also how it arrived there.
This approach also reduces the problem of “confident hallucinations” common in production AI systems. The model is trained to seek verification rather than extrapolate when uncertain, a behavior that directly translates into fewer costly errors.
In this framework, MiroThinker-v1.5-30B offers performance comparable to models with up to 30 times more parameters, including the trillion-parameter Kimi-K2-Thinking model.
On BrowseComp-ZH, a key benchmark for web search capabilities, the 30B model actually outperformed its trillion-parameter competitor, scoring 69.8.
The cost difference is also notable. MiroMind reports inference costs as low as $0.07 per call for the 30B variant, about one-twentieth the cost of Kimi-K2-Thinking, as well as faster inference speeds.
A larger 235B variant (with 22B active parameters in a mixture-of-experts architecture) ranks first globally on several research-agent benchmarks. In general agentic search evaluations, these models hold their own against DeepSeek V3.2, MiniMax, GLM, and Kimi-K2.
In testing, the larger model approaches Gemini 3 Pro on several benchmarks and comes closer to GPT-5-class systems than its parameter count would suggest. As benchmark hill-climbing becomes increasingly common, what matters most is overall competitiveness, and MiroThinker holds up well.
The defining capability of MiroThinker 1.5 is sustained tool use.
The models support a 256,000-token context window and claim up to 400 tool calls per session, an essential requirement for complex research workflows involving extensive information gathering, synthesis, and cross-checking.
This places MiroThinker firmly in the emerging category of agent models designed for autonomous task execution rather than single-round question-and-answer. Practical applications include in-depth research workflows, content pipelines, reporting, and podcast-style outputs similar to NotebookLM.
Another major innovation in MiroThinker 1.5 is its time-sensitive training sandbox.
Traditional model training operates from what MiroMind describes as a “God’s point of view,” where the model has access to finalized results within static data sets, creating hindsight bias. MiroThinker training removes this advantage.
During training, the model can only interact with information released before a given timestamp, preventing leakage from the future and forcing it to reason under realistic conditions of incomplete information.
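A time-gated sandbox of this kind amounts to filtering every retrieval by publication date. The sketch below is illustrative (the data and function names are assumptions, not MiroMind's pipeline), but it captures the mechanism: documents published after an episode's cutoff simply do not exist for the model.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Document:
    text: str
    published: date

CORPUS = [
    Document("Q1 earnings beat estimates.", date(2024, 4, 15)),
    Document("Q2 earnings missed estimates.", date(2024, 7, 15)),
]

def sandboxed_search(query: str, cutoff: date) -> list[Document]:
    """Return only documents the model could have seen as of `cutoff`."""
    return [d for d in CORPUS if d.published <= cutoff]

# A training episode anchored in May 2024 cannot see July's results,
# so the model must reason under incomplete information.
visible = sandboxed_search("earnings", cutoff=date(2024, 5, 1))
print([d.text for d in visible])  # only the Q1 document
```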
The pipeline combines supervised fine-tuning with reinforcement learning using verifiable rewards via Group Relative Policy Optimization (GRPO), an advanced reinforcement learning algorithm popularized by DeepSeek, encouraging the model to select the right tool at the right time.
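GRPO's central trick is that it needs no learned value model: each sampled response is scored against the other responses in its own group, and the normalized score becomes the advantage used for the policy update. A minimal sketch of that group-relative normalization:

```python
import statistics

def grpo_advantages(rewards: list[float]) -> list[float]:
    """Group-relative advantage: a_i = (r_i - mean(group)) / std(group).
    Rewards here would come from a verifiable checker (e.g. answer matches
    a source), not a learned reward model."""
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards) or 1.0  # guard against zero variance
    return [(r - mean) / std for r in rewards]

# Four rollouts for one prompt: the two tool-grounded answers earned reward 1.
print(grpo_advantages([1.0, 0.0, 1.0, 0.0]))  # [1.0, -1.0, 1.0, -1.0]
```

Rollouts that used the right tool at the right time end up with positive advantage and are reinforced; the rest are pushed down, which is how the training signal shapes tool selection.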
This approach is particularly relevant for enterprise use cases where models need to reason about evolving situations rather than recalling static facts.
For IT teams considering deployment, hardware requirements remain important. Even the 30B model requires a significant amount of GPU memory, and smaller configurations may struggle.
One advantage is compatibility. MiroThinker runs on vLLM servers with OpenAI-compatible API endpoints, making it easy to integrate into existing toolchains and function calling workflows as a drop-in replacement.
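In practice, that means deployment looks the same as for any OpenAI-compatible model. The commands below are an assumed sketch (the Hugging Face model id and port are guesses to adjust for your setup), not official deployment documentation:

```shell
# Serve the 30B model behind an OpenAI-compatible endpoint (assumed repo id).
vllm serve miromind-ai/MiroThinker-v1.5-30B --port 8000

# Then call it exactly as you would the OpenAI API.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "miromind-ai/MiroThinker-v1.5-30B",
        "messages": [{"role": "user", "content": "Summarize this week\u0027s chip-export news."}]
      }'
```

Because the endpoint speaks the standard chat-completions schema, existing function-calling clients and agent frameworks can point at it with a one-line `base_url` change.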
Both model sizes are available under the permissive, business-friendly MIT license on Hugging Face, and an online demo is available for evaluation. Permissive licensing removes major barriers to internal deployment and fine-tuning.
MiroThinker 1.5 arrives as the industry faces the limitations of traditional scaling laws. Larger models no longer guarantee better real-world performance. As Artificial Analysis noted, many benchmarks are saturated, pushing the industry toward assessments based on economic utility rather than abstract reasoning alone.
MiroMind’s bet is on interactive scaling: improving capabilities through deeper interaction with tools rather than an ever-increasing parameter count. If correct, this could enable sophisticated agents on infrastructure that does not rely on expensive frontier APIs.
The company, founded by Tianqiao Chen and AI scientist Jifeng Dai, describes its mission as building “native intelligence,” AI that reasons through interaction, not memorization.
The question of whether this approach will become dominant or remain a specialized niche remains open. But for companies grappling with cost-capability tradeoffs, MiroThinker 1.5 offers a compelling data point: Sometimes teaching a model to search is more important than teaching it to memorize everything.