GENIE – SCI & TECH
News: Explained: Google DeepMind’s Genie, an AI model that creates virtual worlds from image prompts
What's in the news?
● The biggest draw of video games is the escapism or the fantasy of a world far removed from our immediate reality.
● Google DeepMind has just introduced Genie, a new model that can generate interactive video games from just a text or image prompt.
Genie AI Model:
● It is a foundation world model that is trained on videos sourced from the Internet.
● The model can “generate an endless variety of playable (action-controllable) worlds from synthetic images, photographs, and even sketches.”
● It is the first generative interactive environment that has been trained in an unsupervised manner from unlabelled internet videos.
Specifications:
● In terms of size, Genie has 11B parameters and consists of three components: a spatiotemporal video tokenizer, an autoregressive dynamics model, and a simple and scalable latent action model.
● Together, these components let users act in generated environments on a frame-by-frame basis, even though the model was trained without action labels or any other domain-specific requirements.
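As a rough illustration of how the three components listed above could fit together at play time, here is a minimal Python sketch. It is not DeepMind's implementation: the real components are large neural networks, while the functions below (tokenize, detokenize, dynamics_step, play) are hypothetical toy stand-ins that only show the frame-by-frame loop of prompt image, user action, and next predicted frame.

```python
from typing import List

Frame = List[List[int]]   # toy image: rows of integer pixel values
Tokens = List[int]        # toy discrete tokens for one frame


def tokenize(frame: Frame) -> Tokens:
    """Toy stand-in for the video tokenizer: flatten pixels into tokens."""
    return [pixel for row in frame for pixel in row]


def detokenize(tokens: Tokens, width: int) -> Frame:
    """Inverse of tokenize: rebuild rows of the given width."""
    return [tokens[i:i + width] for i in range(0, len(tokens), width)]


def dynamics_step(history: List[Tokens], action: int) -> Tokens:
    """Toy stand-in for the dynamics model.

    A real model would predict the next frame's tokens from all past
    tokens and the chosen latent action. Here we simply rotate the
    latest frame's tokens to the right by `action` positions.
    """
    last = history[-1]
    shift = action % len(last)
    if shift == 0:
        return list(last)
    return last[-shift:] + last[:-shift]


def play(prompt: Frame, actions: List[int]) -> List[Frame]:
    """Generate one new frame per user action, starting from a prompt image."""
    width = len(prompt[0])
    history = [tokenize(prompt)]
    for action in actions:
        history.append(dynamics_step(history, action))
    return [detokenize(tokens, width) for tokens in history]


frames = play([[1, 2], [3, 4]], actions=[1, 0])
```

Each action yields exactly one new frame, so a prompt plus two actions gives three frames in total. The latent action model is not needed in this loop because the user supplies the action directly.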
Characteristics:
● Genie can be prompted to generate a diverse set of interactive and controllable environments although it is trained on video-only data.
● It makes playable environments from a single image prompt.
● It can be prompted with images it has never seen, such as real-world photographs and sketches, allowing people to interact with their imagined virtual worlds.
● It is trained mainly on videos of 2D platformer games and robotics.
● Genie's training method is general, so it should work on any type of domain, and it can scale to even larger Internet datasets.
● The standout aspect of Genie is its ability to learn and reproduce controls for in-game characters exclusively from internet videos.
● This is noteworthy because internet videos carry no labels for the action being performed, or even for which part of the image should be controlled.
● It allows you to create an entirely new interactive environment from a single image.
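To make the idea of learning controls from unlabelled video more concrete, the toy Python sketch below assigns a discrete "latent action" to each pair of consecutive frames by looking only at how the frames differ. Everything here is an assumption for illustration: the one-dimensional frames, the hand-picked CODEBOOK, and the function names are hypothetical, and a real latent action model would learn both its encoder and its codebook from data rather than use hand-written rules.

```python
from typing import List, Sequence

# Hypothetical hand-picked codebook: latent action id -> horizontal displacement.
# The ids carry no human labels; "left/stay/right" is only our reading of them.
CODEBOOK = [-1, 0, 1]


def character_position(frame: Sequence[int]) -> int:
    """Index of the brightest pixel in a 1-D toy frame."""
    return max(range(len(frame)), key=lambda i: frame[i])


def infer_latent_action(prev: Sequence[int], nxt: Sequence[int]) -> int:
    """Pick the codebook entry that best explains the change between frames."""
    delta = character_position(nxt) - character_position(prev)
    return min(range(len(CODEBOOK)), key=lambda k: abs(CODEBOOK[k] - delta))


def label_video(frames: Sequence[Sequence[int]]) -> List[int]:
    """Infer one latent action id per consecutive frame pair."""
    return [infer_latent_action(a, b) for a, b in zip(frames, frames[1:])]


# An unlabelled toy "video": a bright pixel moves right, stays, then moves left.
video = [
    [9, 0, 0, 0],
    [0, 9, 0, 0],
    [0, 9, 0, 0],
    [9, 0, 0, 0],
]
actions = label_video(video)
```

The video itself never says which action was taken; the action ids are recovered purely from frame-to-frame changes, which is the property the points above highlight.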