From smartphones to robots: Xiaomi launches Robotics-0 and it’s open source

Xiaomi has officially broadened its horizons by entering advanced robotics research. After MiMo, the ChatGPT rival from Lei Jun’s company, Xiaomi has presented Xiaomi-Robotics-0, its first large-scale Vision-Language-Action (VLA) model, released as open source.

With 4.7 billion parameters, the system is designed to combine visual understanding, language reasoning and real-time execution of physical actions in a single model.

Xiaomi-Robotics-0: The revolution of intelligent robotics goes open source

Credits: Xiaomi

The main innovation of Xiaomi-Robotics-0 lies in its Mixture-of-Transformers (MoT) architecture, which mimics the collaboration between the human brain and cerebellum. The system assigns command understanding to a vision-language model (VLM) that acts as the “brain”: it interprets human instructions even when they are vague and analyzes spatial relationships from high-definition video input.

In parallel, movement is delegated to an Action Expert, which uses a diffusion transformer (DiT) to generate fluid, precise movement sequences. Separating the two tasks balances deep logical reasoning with very fine motor control and prevents the robot from losing its general understanding capabilities while it learns new physical tasks.
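The division of labor between the two components can be illustrated with a toy sketch. This is not Xiaomi's code and contains no real VLM or diffusion transformer: `vlm_brain` and `action_expert` are hypothetical stand-ins, and the lookup table, step count and update rule are assumptions chosen for illustration. The expert starts from random noise and refines it step by step toward a target supplied by the "brain", which only mimics the start-from-noise-and-refine shape of diffusion-style generation.

```python
import random

def vlm_brain(instruction: str) -> list[float]:
    """Hypothetical stand-in for the VLM: maps an instruction to a target pose."""
    targets = {"pick": [0.2, 0.5], "place": [0.8, 0.1]}  # made-up values
    return targets[instruction]

def action_expert(context: list[float], steps: int = 20, seed: int = 0) -> list[float]:
    """Toy refiner: begins with Gaussian noise and moves toward the context."""
    rng = random.Random(seed)
    action = [rng.gauss(0.0, 1.0) for _ in context]
    for _ in range(steps):
        # Each step removes half of the remaining error to the target.
        action = [a + 0.5 * (c - a) for a, c in zip(action, context)]
    return action

context = vlm_brain("pick")      # "brain": understand the request
action = action_expert(context)  # "cerebellum": produce the motion
```

Because the two functions only communicate through `context`, the motion generator can be changed or retrained without touching the instruction-understanding part, which is the intuition behind the separation described above.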

What makes the model especially practical for developers is how it eliminates the micro-stutters and unstable movements often caused by processing latency. Xiaomi has introduced asynchronous inference, a technique that decouples the model’s reasoning process from the robot’s physical execution, so that motion continues smoothly even when the system needs more time to process a complex command.

Moreover, dedicated safeguards let the robot prioritize immediate visual feedback over historical memory, so it can react instantly to sudden changes in its surroundings.
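The idea of decoupling inference from execution can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Xiaomi's implementation: `slow_policy`, `CHUNK`, the sleep durations and the integer "observations" are all hypothetical. The control loop keeps executing actions from the current chunk while the next chunk is computed in a background thread.

```python
import time
from concurrent.futures import ThreadPoolExecutor

CHUNK = 4  # actions per predicted chunk (hypothetical value)

def slow_policy(observation: int) -> list[int]:
    """Stand-in for the model: slow to run, returns a chunk of future actions."""
    time.sleep(0.02)  # simulated inference latency
    return [observation + i for i in range(CHUNK)]

def run(steps: int) -> list[int]:
    executed = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        chunk = slow_policy(0)  # the first chunk is computed synchronously
        pending = None
        i = 0
        for _ in range(steps):
            if pending is None:
                # Start computing the next chunk while this one executes.
                pending = pool.submit(slow_policy, chunk[-1] + 1)
            executed.append(chunk[i])  # "execute" one action
            i += 1
            time.sleep(0.01)  # simulated control period
            if i == len(chunk):
                # Swap in the new chunk; blocks only if inference is late.
                chunk, pending, i = pending.result(), None, 0
    return executed

actions = run(10)
```

In this toy setup the background call takes two control periods while a chunk lasts four, so the next chunk is normally ready before the current one runs out and the loop rarely has to wait.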

The performance of Xiaomi-Robotics-0 has already been validated on benchmarks such as LIBERO, CALVIN and SimplerEnv, where the model outperformed dozens of competing systems. In real-world tests, dual-arm robots successfully completed long-horizon tasks, such as disassembling construction blocks and handling soft, flexible objects.

An important detail is hardware compatibility: the model supports real-time inference even on consumer-grade GPUs, drastically lowering the entry barriers for research and development in robotics.

The Chinese company has released the source code and model weights of Xiaomi-Robotics-0 on platforms such as GitHub and Hugging Face. The project’s main page is also available on GitHub.

Gabriele Cascone

In love with technology, with a particular eye for smartphones and gaming, he is inextricably tied to the nerd world. TV series, films, games, manga, anime and comics are part of his daily routine.
