Watch a robot navigate the Google DeepMind offices using Gemini

Generative AI has shown promise across robotics applications, including natural language interaction, robot learning, and design. Google DeepMind's robotics team has demonstrated the use of Gemini 1.5 Pro to teach a robot to respond to commands and navigate an office space. In the demonstration, the robot follows commands to lead people to different locations within the office. To familiarize the robot with the space, the team used a method called Multimodal Instruction Navigation with demonstration Tours (MINT), which combines environment understanding with common sense reasoning. The robot can respond to written and drawn commands as well as gestures. Google reports a 90% success rate across more than 50 interactions with employees. The demonstration points to the potential of generative AI to improve robot navigation and interaction in real-world settings.

