Google DeepMind Unveils Gemini Robotics AI Models That Can Control Robots in the Real World

Google DeepMind unveiled Gemini Robotics and Gemini Robotics-ER (embodied reasoning) AI models.

Written by Akash Dutta, Edited by Siddharth Suvarna | Updated: 13 March 2025 14:33 IST
Highlights
  • Google is partnering with Apptronik to build humanoid robots
  • Gemini Robotics offers generality, interactivity, and dexterity
  • The models were trained on data from the robotic platform Aloha 2

Gemini Robotics-ER focuses on spatial reasoning in real-world environments

Photo Credit: Google

Google DeepMind unveiled two new artificial intelligence (AI) models on Thursday that can control robots and make them perform a wide range of tasks in real-world environments. Dubbed Gemini Robotics and Gemini Robotics-ER (embodied reasoning), these are advanced vision-language models capable of displaying spatial intelligence and performing actions. The Mountain View-based tech giant also revealed that it is partnering with Apptronik to build Gemini 2.0-powered humanoid robots. The company is continuing to test the models to evaluate them further and understand how to improve them.

Google DeepMind Unveils Gemini Robotics AI Models

In a blog post, DeepMind detailed the new AI models for robots. Carolina Parada, the Senior Director and Head of Robotics at Google DeepMind, said that for AI models to be helpful to people in the physical world, they would have to demonstrate “embodied” reasoning, which is the ability to understand and interact with the physical world and perform actions to complete tasks.

Gemini Robotics, the first of the two AI models, is an advanced vision-language-action (VLA) model which was built using the Gemini 2.0 model. It has a new output modality of “physical actions” which allows the model to directly control robots.

DeepMind highlighted that to be useful in the physical world, AI models for robotics require three key capabilities: generality, interactivity, and dexterity. Generality refers to a model's ability to adapt to different situations. Gemini Robotics is “adept at dealing with new objects, diverse instructions, and new environments,” claimed the company. Based on internal testing, the researchers said the AI model more than doubles performance on a comprehensive generalisation benchmark compared with other state-of-the-art vision-language-action models.

The AI model's interactivity is built on the foundation of Gemini 2.0, and it can understand and respond to commands phrased in everyday, conversational language, including in different languages. Google claimed that the model also continuously monitors its surroundings, detects changes to the environment or instructions, and adjusts its actions accordingly.

Finally, DeepMind claimed that Gemini Robotics can perform extremely complex, multi-step tasks that require precise manipulation of the physical environment. The researchers said the AI model can control robots to fold a piece of paper or pack a snack into a bag.

The second AI model, Gemini Robotics-ER, is also a vision-language model, but it focuses on spatial reasoning. Drawing on Gemini 2.0's coding and 3D detection capabilities, the AI model is said to be able to work out the right moves to manipulate an object in the real world. Highlighting an example, Parada said that when the model was shown a coffee mug, it was able to generate a command for a two-finger grasp to pick it up by the handle along a safe trajectory.

Gemini Robotics-ER handles the steps needed to control a robot in the physical world, including perception, state estimation, spatial understanding, planning, and code generation. Notably, neither of the two AI models is currently publicly available. DeepMind will likely first integrate the models into a humanoid robot and evaluate their capabilities before releasing the technology.

 
