Xiaomi has officially expanded its horizons into advanced robotics research. After MiMo, Lei Jun's company's answer to ChatGPT, Xiaomi has presented Xiaomi-Robotics-0, its first large-scale Vision-Language-Action (VLA) model, released as open source.
Built on a 4.7-billion-parameter architecture, the system is designed to integrate visual understanding, language reasoning, and real-time execution of physical actions in a single model.
The main innovation of Xiaomi-Robotics-0 lies in its Mixture-of-Transformers (MoT) architecture, which mirrors the collaboration between the human brain and cerebellum. The system assigns command understanding to a vision-language model (VLM) that acts as the "brain", capable of interpreting human instructions even when they are vague and of analyzing spatial relationships through high-definition video input.
In parallel, movement management is delegated to an Action Expert that uses a diffusion transformer (DiT) to generate fluid and precise movement sequences. This separation of tasks balances deep logical reasoning with extremely fine motor control, preventing the robot from losing its general understanding while learning new physical tasks.
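The brain/cerebellum split can be illustrated with a minimal sketch. Everything here is hypothetical: the class names (`VisionLanguageBrain`, `DiffusionActionExpert`) and the toy "denoising" loop are illustrative stand-ins, not Xiaomi's actual API.

```python
import random

# Hypothetical sketch of the "brain / cerebellum" split described above.
# Names and logic are illustrative, not Xiaomi's actual implementation.

class VisionLanguageBrain:
    """The 'brain': turns an instruction plus camera frames into a latent plan."""
    def plan(self, instruction: str, frames: list) -> list:
        # A real VLM would run a transformer over text and video;
        # here we just derive a deterministic fake latent vector.
        rng = random.Random(hash((instruction, len(frames))) % 1000)
        return [rng.uniform(-1, 1) for _ in range(8)]

class DiffusionActionExpert:
    """The 'cerebellum': iteratively refines a trajectory toward the plan,
    a toy stand-in for a diffusion transformer's denoising steps."""
    def act(self, latent: list, steps: int = 4) -> list:
        actions = [0.0] * len(latent)
        for step in range(steps, 0, -1):
            # Each denoising step pulls the trajectory closer to the plan.
            actions = [a + (l - a) / step for a, l in zip(actions, latent)]
        return actions

brain = VisionLanguageBrain()
expert = DiffusionActionExpert()
latent = brain.plan("pick up the red block", frames=[None] * 3)
chunk = expert.act(latent)
print(len(chunk))  # → 8 (one action value per latent dimension)
```

The point of the split is visible even in this toy: the expert only ever sees the brain's latent plan, so the motor side can be retrained on new physical tasks without touching the reasoning side.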
A key aspect that makes this model practical for developers is the solution adopted to eliminate micro-stutters and instability in movements, which are often caused by processing latency. Xiaomi has introduced asynchronous inference, a technique that decouples the model's reasoning process from the robot's physical execution, ensuring continuity of action even when the system needs more time to process a complex command.
Moreover, built-in safeguards let the robot prioritize immediate visual feedback over historical memory, so it can react instantly to sudden changes in its surroundings.
The performance of Xiaomi-Robotics-0 has already been validated on benchmarks such as LIBERO, CALVIN, and SimplerEnv, where the model outperformed dozens of competing systems. In real-world tests, dual-arm robots successfully completed long-horizon tasks, such as disassembling construction blocks and handling soft, flexible objects.
An important detail is hardware compatibility: the model supports real-time inference even on consumer-grade GPUs, drastically lowering the barrier to entry for robotics research and development.
The Chinese company has released the source code and model weights of Xiaomi-Robotics-0 on platforms such as GitHub and Hugging Face, where the project's main page can also be found.