Google Genie, an AI Model That Can Generate 2D Platformer Games, Introduced; How It Works


Google’s Genie AI model was trained on 200,000 hours of videos from 2D platformers.

Written by Akash Dutta, Edited by Siddharth Suvarna | Updated: 28 February 2024 13:01 IST


Photo Credit: Google

Google Genie is currently not available to the public

Highlights

Google defined Genie as an action-controllable world model
The AI model uses predictive analysis to generate video games
Google Genie can convert any image into a playable 2D world

Google has introduced another generative artificial intelligence (AI) model, one that can create an endless variety of 2D platformer video games. Genie is being touted as an action-controllable world model that was trained without supervision on video game footage. It uses predictive analysis to generate video game levels, and it can also control a playable character and determine its movements. Interestingly, OpenAI also introduced a world model earlier this month called Sora, which can generate hyperrealistic videos of up to one minute in length.

The announcement was made by Tim Rocktäschel, Open-Endedness Team Lead, Google DeepMind, via a series of posts on X (formerly known as Twitter). He said, “We introduce Genie, a foundation world model trained exclusively from Internet videos that can generate an endless variety of action-controllable 2D worlds given image prompts.” Genie is unusual in that it generates only one kind of output, and it is also the only video game-generating model that has been publicly announced so far.

Google's Genie AI model is not open to the public yet and exists only as a research model for now. As a result, little is known about how end users would interact with it. It can generate video game levels from images, but whether it can also take text or video prompts is not known. A preprint of the paper, posted online, describes its technical aspects. The AI model was trained unsupervised on 200,000 hours of video game footage and contains 11 billion parameters. Its architecture has three parts: a spatiotemporal video tokenizer, an autoregressive dynamics model, and a simple and scalable latent action model.


How Google Genie Works

To simplify, the spatiotemporal video tokenizer takes video game footage and breaks it down into smaller chunks of data, known as tokens, that can be consumed by the foundation model. Spatiotemporal means that the data is broken down in both time and space (for example, a video might be split into 2-second clips, with each frame also split into multiple pieces).
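The chunking idea can be illustrated with a short sketch. This is not Genie's tokenizer: the function name `spatiotemporal_patches`, the array layout, and the block sizes are assumptions made for illustration, and the step where a real tokenizer learns a code for each chunk is omitted. It only shows what splitting a video in both time and space looks like on an array.

```python
import numpy as np

def spatiotemporal_patches(video, t_size, p_size):
    """Split a video of shape (T, H, W, C) into blocks that span
    t_size frames in time and p_size x p_size pixels in space.
    Hypothetical helper for illustration only."""
    T, H, W, C = video.shape
    if T % t_size or H % p_size or W % p_size:
        raise ValueError("video dimensions must be divisible by the block sizes")
    nt, nh, nw = T // t_size, H // p_size, W // p_size
    # Separate each axis into (block index, offset inside block).
    blocks = video.reshape(nt, t_size, nh, p_size, nw, p_size, C)
    # Bring the three block indices to the front, then flatten each block.
    blocks = blocks.transpose(0, 2, 4, 1, 3, 5, 6)
    return blocks.reshape(nt * nh * nw, t_size * p_size * p_size * C)

# A tiny fake "video": 8 frames of 4x4 pixels with 3 colour channels.
video = np.arange(8 * 4 * 4 * 3, dtype=np.float32).reshape(8, 4, 4, 3)
patches = spatiotemporal_patches(video, t_size=2, p_size=2)
print(patches.shape)  # (16, 24)
```

Each of the 16 rows is one chunk covering 2 frames and a 2x2 pixel region, which is the raw material a tokenizer would then turn into a token.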

The autoregressive dynamics model comes next. Autoregressive models essentially predict what comes next based on what has come before, and a dynamics model is responsible for understanding how things change and move over time. This is where the predictive analysis begins. The final component is the latent action model, which is where the AI learns how the playable character moves through and traverses the video game world.
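A toy example may help make these two ideas concrete. It is a sketch under heavy assumptions, not Genie's method: the "world" is a single x position, the codebook `ACTION_EFFECTS` and the functions `infer_latent_actions` and `rollout` are hypothetical, and a fixed lookup table stands in for what would be learned neural networks.

```python
import numpy as np

# Hypothetical codebook of latent actions: each id maps to a change
# in x position (for example: move left, stay, move right).
ACTION_EFFECTS = np.array([-1, 0, 1])

def infer_latent_actions(positions):
    """Label each transition between consecutive observations with the
    id of the action whose effect best matches the observed change."""
    deltas = np.diff(positions)
    return np.abs(deltas[:, None] - ACTION_EFFECTS[None, :]).argmin(axis=1)

def rollout(start, actions):
    """Autoregressive generation: each new state is computed from the
    previously generated state plus the chosen action."""
    states = [start]
    for a in actions:
        states.append(states[-1] + int(ACTION_EFFECTS[a]))
    return states

# Observations without action labels, as in unlabelled video footage.
observed = np.array([0, 1, 2, 2, 1])
actions = infer_latent_actions(observed)
print(actions.tolist())     # [2, 2, 1, 0]
print(rollout(5, actions))  # [5, 6, 7, 7, 6]
```

The first function recovers actions from footage that never recorded them, and the second replays those actions from a new starting point, one step at a time.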

“Genie's learned latent action space is not just diverse and consistent, but also interpretable. After a few turns, humans generally figure out a mapping to semantically meaningful actions (like going left, right, jumping etc.),” said Rocktäschel. This matters because it shows that the main problem the AI model solves is not just generating 2D video game levels, but also understanding how basic movements occur and how that information can be used to navigate real-world terrains.

Highlighting this, he added, “Genie's model is general and not constrained to 2D. We also train a Genie on robotics data (RT-1) without actions, and demonstrate that we can learn an action controllable simulator there too. We think this is a promising step towards general world models for AGI.”
