Google DeepMind Unveils Genie 2 AI Model, Can Generate Playable 3D Worlds to Train AI Agents

Google said these action-controllable, playable 3D environments can be played by humans or AI agents.

Written by Akash Dutta, Edited by Siddharth Suvarna | Updated: 5 December 2024 19:22 IST
Photo Credit: Google

Google says Genie 2 can generate consistent worlds for up to a minute

Highlights
  • The new AI model is the successor to Genie, which was unveiled in February
  • Google DeepMind’s Genie 2 accepts images as an input
  • The tech giant describes Genie 2 as an AI “world model”
On Wednesday, Google DeepMind unveiled Genie 2, the successor to its Genie artificial intelligence (AI) model, which could generate endless 2D game worlds. The new model can generate unique, action-controllable, playable 3D environments from a single image prompt. Calling Genie 2 an AI “world model”, the company stated that it can generate environments lasting up to a minute with consistent objects. The company said these generated worlds can be played by humans or used to train AI agents.

Google DeepMind Unveils Genie 2 AI Model

In a blog post, the company detailed the new AI model and its capabilities. While its predecessor could only generate game worlds for 2D platformer games, Genie 2 can generate 3D worlds complete with consistent models that can be interacted with. This means humans or AI agents can walk, run, swim, climb, and perform other actions in these environments.

Genie 2 can generate routes, buildings, and objects that are not visible in the input image. These elements are designed and rendered by the model from scratch. The foundation model is also capable of maintaining consistency in these environments: when a player moves away from an area and later returns, the environment remains the same.

Genie 2 can also generate different perspectives, such as first-person, isometric, or third-person views. Users can interact with objects in the generated worlds and perform actions such as opening a door, bursting a balloon, or climbing a ladder. The model can also be prompted to generate physics-related effects such as water ripples, smoke, gravity, directional lighting, reflections, and more.

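On the technical side, DeepMind explained that Genie 2 is an autoregressive latent diffusion model trained on a large video dataset. Video frames are first compressed by an autoencoder, and the resulting latent frames are passed to a large transformer dynamics model. At inference time, the model is sampled autoregressively, generating the world frame by frame from past latent frames and the player's actions.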
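To make the frame-by-frame, autoregressive idea more concrete, here is a minimal toy sketch in Python. It is not DeepMind's code and does not reflect Genie 2's actual implementation: every name in it (`encode`, `decode`, `predict_next_latent`, `rollout`, the action table, and the constants) is hypothetical. The "autoencoder" is simple average pooling, and a few steps of moving a random vector toward a target stand in for the diffusion model's denoising. The only point it illustrates is the loop: encode a prompt frame into a latent, then repeatedly predict the next latent from recent latents plus an action, and decode it.

```python
import random

LATENT_DIM = 4    # size of the toy latent vector (assumption)
CONTEXT_LEN = 3   # number of past latent frames the toy model looks at (assumption)

# Hypothetical action set; each action nudges the latent in one direction.
ACTIONS = {"left": -1.0, "stay": 0.0, "right": 1.0}


def encode(frame):
    """Toy stand-in for the encoder: average-pool a frame into LATENT_DIM values."""
    chunk = len(frame) // LATENT_DIM
    return [sum(frame[i * chunk:(i + 1) * chunk]) / chunk for i in range(LATENT_DIM)]


def decode(latent, frame_len):
    """Toy stand-in for the decoder: repeat each latent value to rebuild a frame."""
    chunk = frame_len // LATENT_DIM
    return [value for value in latent for _ in range(chunk)]


def predict_next_latent(past_latents, action, rng, denoise_steps=4):
    """Toy stand-in for the dynamics model.

    Starts from random noise and refines it over a few steps toward a target
    computed from the recent latents and the chosen action.
    """
    context = past_latents[-CONTEXT_LEN:]
    target = [sum(col) / len(context) + 0.1 * ACTIONS[action] for col in zip(*context)]
    latent = [rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]
    for _ in range(denoise_steps):
        latent = [x + 0.5 * (t - x) for x, t in zip(latent, target)]
    return latent


def rollout(prompt_frame, actions, seed=0):
    """Autoregressive loop: each new latent is conditioned on earlier latents and an action."""
    rng = random.Random(seed)
    latents = [encode(prompt_frame)]  # the image prompt becomes the first latent
    frames = []
    for action in actions:
        next_latent = predict_next_latent(latents, action, rng)
        latents.append(next_latent)   # the prediction feeds back in as context
        frames.append(decode(next_latent, len(prompt_frame)))
    return frames


if __name__ == "__main__":
    generated = rollout([0.5] * 8, ["right", "right", "left"])
    print(len(generated), "frames generated")
```

In this sketch, changing the action sequence changes the generated frames, which is the sense in which such an environment is "action-controllable"; a real system would replace each toy function with a learned neural network.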

Notably, DeepMind introduced an AI agent called Scalable Instructable Multiworld Agent (SIMA) earlier this year, which can follow natural-language instructions to carry out tasks in 3D game worlds. The company says Genie 2 can supply unique environments for such agents and help train them for various real-life scenarios.

Since the world model can generate unique environments, Google says this will eliminate the risk of data contamination and will allow developers to correctly assess an AI agent's capabilities.


