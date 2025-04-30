Xiaomi on Tuesday released an open-source reasoning-focused artificial intelligence (AI) model. Dubbed MiMo, the family of reasoning models is designed to optimise reasoning capability at a relatively small parameter size. This is also the tech giant's first open-source reasoning model, and it competes with Chinese models such as DeepSeek R1 and Alibaba's Qwen QwQ-32B, as well as global reasoning models including OpenAI's o1 and Google's Gemini 2.0 Flash Thinking. The MiMo family comprises four different models, each with unique use cases.

With the MiMo series of AI models, Xiaomi researchers aimed to solve the size problem in reasoning AI models. Reasoning models (at least ones that can be measured) have around 24 billion or more parameters. Developers keep models this large in order to achieve uniform and simultaneous improvements in both the coding and mathematical capabilities of large language models, something considered difficult to achieve with smaller models.

In comparison, MiMo features seven billion parameters, and Xiaomi claims that its performance matches OpenAI's o1-mini and outperforms several reasoning models with 32 billion parameters. The researchers claimed that the base AI model was pre-trained on 25 trillion tokens.
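To give a rough sense of what the gap in parameter count means in practice, the sketch below estimates the memory needed just to hold a model's weights. It assumes 16-bit weights (two bytes per parameter), which the article does not state, and it ignores activations and any inference cache, so the figures are a lower bound rather than a hardware requirement.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory for the weights alone, in gigabytes (1 GB = 1e9 bytes).

    bytes_per_param=2 assumes 16-bit weights; this is an assumption,
    not a figure reported by Xiaomi.
    """
    return num_params * bytes_per_param / 1e9


# Parameter counts taken from the article: 7 billion for MiMo,
# 32 billion for the larger reasoning models it is compared against.
for name, params in [("7B (MiMo)", 7e9), ("32B", 32e9)]:
    print(f"{name}: ~{weight_memory_gb(params):.0f} GB of weights at 16-bit precision")
```

Under these assumptions, the seven-billion-parameter model needs roughly 14 GB for its weights, against roughly 64 GB for a 32-billion-parameter model.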

The researchers claimed that such efficiency was achieved by optimising data preprocessing pipelines, enhancing text extraction toolkits, and applying multidimensional data filtering. Further, MiMo's pre-training included a three-stage data mixture strategy.

Based on internal testing, the Xiaomi researchers claim that MiMo-7B-Base scores 75.2 on the BIG-Bench Hard (BBH) benchmark for reasoning capabilities. The zero-shot reinforcement learning (RL)-based MiMo-7B-RL-Zero is claimed to excel in mathematics and coding-related tasks, and is said to score 55.4 on the AIME benchmark, outperforming o1-mini by 4.7 points.

As MiMo is an open-source AI model, it can be downloaded from Xiaomi's listings on GitHub and Hugging Face. The technical paper details the model's architecture as well as the pre-training and post-training processes. It is a text-based model and does not have multimodal capabilities. As with most open-source releases, details about the model's dataset are not known.
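For readers who want to try the download, the following is a minimal sketch using the Hugging Face Transformers library. The organisation name `XiaomiMiMo` and the `MiMo-7B-<variant>` repository naming are assumptions made for illustration and should be checked against Xiaomi's actual listing; `repo_id` and `load_mimo` are hypothetical helper names, not part of any Xiaomi tooling.

```python
# Assumption: the Hugging Face organisation name is illustrative only;
# verify it against Xiaomi's listing before use.
HF_ORG = "XiaomiMiMo"

# Only the two variants named in the article are listed here; the family
# is said to contain four models in total.
KNOWN_VARIANTS = ("Base", "RL-Zero")


def repo_id(variant: str) -> str:
    """Build the assumed repository identifier for a MiMo variant."""
    if variant not in KNOWN_VARIANTS:
        raise ValueError(f"unknown variant: {variant!r}")
    return f"{HF_ORG}/MiMo-7B-{variant}"


def load_mimo(variant: str = "Base", trust_remote_code: bool = False):
    """Download and load the tokenizer and model (requires network access).

    trust_remote_code may need to be True if the repository ships custom
    model code; enable it only after reviewing that code.
    """
    # Imported lazily so repo_id() works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    rid = repo_id(variant)
    tokenizer = AutoTokenizer.from_pretrained(rid, trust_remote_code=trust_remote_code)
    model = AutoModelForCausalLM.from_pretrained(rid, trust_remote_code=trust_remote_code)
    return tokenizer, model
```

Calling `load_mimo("Base")` would fetch several gigabytes of weights on first use, so it is best run on a machine with sufficient disk space and memory.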