By Insights Team in AI — 25 Jun 2025

Just Released: The First Embodied Gemini That Runs Locally on Robots

Google DeepMind unveils Gemini Robotics On-Device, a vision-language-action model that runs directly on robots, enabling fast, efficient, and offline-capable robotic interactions. Open source now available.

Today, the Gemini family welcomes a new member: Gemini Robotics On-Device.

This is Google DeepMind's first vision-language-action (VLA) model that can be deployed directly on robots, helping them adapt faster and more efficiently to new tasks and environments without needing constant internet connectivity.

Gemini Robotics On-Device builds on Gemini Robotics, which was released in March this year and is based on the multimodal reasoning capabilities of Gemini 2.0.

It demonstrates strong generality and task transferability and is optimized to run efficiently on robotic hardware. Because it runs without a data network, it is well suited to latency-sensitive applications and stays robust in disconnected or offline environments.
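
To see why local inference matters for latency-sensitive control, a back-of-the-envelope sketch can help. The numbers below (a 50 Hz control loop, 15 ms on-device inference, a 120 ms cloud round trip) are illustrative assumptions, not figures from DeepMind, and the function names are hypothetical.

```python
def step_budget_ms(control_hz: float) -> float:
    """Time available per control step, in milliseconds."""
    return 1000.0 / control_hz


def fits_budget(latency_ms: float, control_hz: float) -> bool:
    """True if one inference call finishes within a single control step."""
    return latency_ms <= step_budget_ms(control_hz)


# All three values are assumptions chosen only for illustration.
CONTROL_HZ = 50.0            # assumed control-loop rate
ON_DEVICE_MS = 15.0          # assumed local inference latency
CLOUD_ROUND_TRIP_MS = 120.0  # assumed network plus remote inference latency

print(f"Budget per step: {step_budget_ms(CONTROL_HZ):.1f} ms")
print(f"On-device fits:  {fits_budget(ON_DEVICE_MS, CONTROL_HZ)}")
print(f"Cloud fits:      {fits_budget(CLOUD_ROUND_TRIP_MS, CONTROL_HZ)}")
```

Under these assumed values, only the on-device call fits inside a single control step; real budgets depend on the robot and the task.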

Online reaction has been positive.

Google will also release the Gemini Robotics SDK, which lets developers evaluate the model across tasks and environments, test it in DeepMind's MuJoCo physics simulator, and adapt it to new domains with as few as 50 to 100 demonstrations.

Additionally, the MuJoCo Playground, jointly developed by UC Berkeley, Google DeepMind, University of Toronto, and Cambridge University, recently won the Best Demo Award at RSS 2025.

Paper Title: Demonstrating MuJoCo Playground
Paper Link: https://www.roboticsproceedings.org/rss21/p020.pdf

Model Capabilities and Performance

Gemini Robotics On-Device is designed for dual-arm robots and built to consume minimal computing resources. It supports fast, dexterous manipulation, fine-tuning for new tasks, and low-latency local inference.

DeepMind conducted extensive tests of visual, semantic, and behavioral generalization, showing that the model can follow natural language commands and perform dexterous tasks such as operating zippers or folding clothes, all directly on the robot.

Even in local mode, Gemini Robotics On-Device shows impressive generalization performance.

Compared with the previous best on-device robot models, Gemini Robotics On-Device has clear advantages, especially on challenging out-of-distribution tasks and complex multi-step instructions.

Developers who do not need strictly local deployment can use the Gemini Robotics model instead. For technical details, see the report: https://arxiv.org/pdf/2503.20020

Task Adaptability and Cross-Embodiment Generalization

Gemini Robotics On-Device is the first VLA model that DeepMind has made available for fine-tuning. It can adapt to new tasks with only 50-100 demonstrations, which indicates how well its underlying knowledge generalizes to novel tasks.

DeepMind tested it on seven dexterous tasks, including zipping a lunch box, drawing a card, and pouring salad dressing.

[Figure: the model's task adaptation with fewer than 100 examples]

DeepMind also explored how the model can be adapted to different robots. Although it was trained on ALOHA robots, it was further adapted to dual-arm Franka robots and to humanoid robots such as Apptronik’s Apollo.

On dual-arm Franka robots, the model can follow general-purpose instructions, handle unseen objects and scenarios, and execute dexterous tasks such as folding a dress or precise industrial assembly.

The same general-purpose model can also manipulate different objects on the Apollo humanoid robot, following natural language commands and handling unseen objects effectively.

DeepMind states: "Gemini Robotics On-Device marks a step forward in making powerful robot models more accessible and adaptable."

It seems we are one step closer to the era of embodied intelligence.

Other Updates on Gemini Models

Besides Gemini Robotics On-Device, Google DeepMind also announced a reduction in free usage quotas for Gemini 2.5 Flash, from 500 to 250 requests per day, and for Gemini 2.0 Flash, from 1500 to 200 requests per day (source: https://x.com/ai_for_success/status/1937493142279971210).
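
As a quick check on the size of these cuts, the relative reductions can be computed from the figures quoted above. This is a minimal sketch; the helper name `percent_reduction` is introduced here for illustration only.

```python
def percent_reduction(old: int, new: int) -> float:
    """Relative drop from old to new, as a percentage of old."""
    return (old - new) / old * 100.0


# Free-tier daily request quotas as reported in this article: (before, after).
quotas = {
    "Gemini 2.5 Flash": (500, 250),
    "Gemini 2.0 Flash": (1500, 200),
}

for model, (old, new) in quotas.items():
    drop = percent_reduction(old, new)
    print(f"{model}: {old} -> {new} requests/day ({drop:.1f}% lower)")
```

On these figures, the Gemini 2.5 Flash quota is halved, while the Gemini 2.0 Flash quota drops by roughly 87%.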

Google AI Studio and the Gemini API also added two new image generation models, Imagen 4 and Imagen 4 Ultra. Both are now available to try for free in Google AI Studio.

We also tested Imagen 4 Ultra, which generated a colorful ink wash painting featuring a cat, a robot, and an alien.
