Google DeepMind unveiled the successor to Genie, an artificial intelligence (AI) model that can generate limitless 2D game worlds, on Wednesday. Dubbed Genie 2, the new AI model can generate unique, action-controllable, playable 3D environments from a single image prompt. Calling Genie 2 an AI “world model”, the company stated that it can generate environments up to a minute long with consistent objects. The company said these generated worlds can be played by humans or used to train AI agents.
Google DeepMind Unveils Genie 2 AI Model
In a blog post, the company detailed the new AI model and its capabilities. While its predecessor could only generate game worlds for 2D platformer games, Genie 2 can generate 3D worlds complete with consistent models that can be interacted with. This means humans or AI agents can walk, run, swim, climb, and perform other actions in these environments.
Genie 2’s generative capabilities allow it to create routes, buildings, and objects that are not visible in the input image. These elements are designed and rendered by the model from scratch. Additionally, the foundation model is capable of maintaining consistency in these environments. This means that even when a player moves away from one area and later returns, the environment remains the same.
Apart from this, Genie 2 can generate different perspectives such as first-person, isometric, or third-person views. Further, users can interact with the objects in the generated worlds and perform actions such as opening a door, bursting a balloon, or climbing a ladder. The model can also be prompted to generate physics-related effects such as water ripples, smoke, gravity, directional lighting, reflections, and more.
Coming to the technical details, DeepMind explained that Genie 2 is an autoregressive latent diffusion model that has been trained on a large video dataset. The transformer architecture also includes an autoencoder, which enables frame-by-frame generation of these worlds.
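To make the idea of frame-by-frame generation in a compressed latent space more concrete, here is a toy Python sketch. It is not DeepMind's code or architecture: every name in it (`encode`, `decode`, `denoise_step`, `generate`, `LATENT_DIM`) is hypothetical, the "denoiser" is a hand-written stand-in rather than a trained network, and frames are plain lists of numbers instead of images. It only illustrates the general loop such a model might follow, where each new frame starts as noise and is refined while conditioned on earlier frames and the player's action.

```python
import random

LATENT_DIM = 4  # hypothetical latent size, chosen only for illustration


def encode(frame):
    """Toy stand-in for an autoencoder's encoder: average-pool a frame
    (a list of floats whose length is a multiple of LATENT_DIM)."""
    chunk = len(frame) // LATENT_DIM
    return [sum(frame[i * chunk:(i + 1) * chunk]) / chunk for i in range(LATENT_DIM)]


def decode(latent, frame_len):
    """Toy stand-in for the decoder: repeat each latent value to fill a frame."""
    chunk = frame_len // LATENT_DIM
    return [value for value in latent for _ in range(chunk)]


def denoise_step(noisy, history, action):
    """Hand-written stand-in for a learned denoiser: move the noisy latent
    halfway toward a target built from the previous latent and the action."""
    target = [prev + action for prev in history[-1]]
    return [n + 0.5 * (t - n) for n, t in zip(noisy, target)]


def generate(first_frame, actions, denoise_steps=8, seed=0):
    """Autoregressive rollout: each new latent starts as noise and is refined
    while conditioned on the latents generated so far and the current action."""
    rng = random.Random(seed)
    history = [encode(first_frame)]  # the image prompt becomes the first latent
    frames = [first_frame]
    for action in actions:
        latent = [rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]
        for _ in range(denoise_steps):
            latent = denoise_step(latent, history, action)
        history.append(latent)  # later frames can see this one
        frames.append(decode(latent, len(first_frame)))
    return frames


if __name__ == "__main__":
    rollout = generate([0.0] * 8, actions=[1.0, 1.0, -1.0])
    print(len(rollout), "frames generated")
```

In this sketch the single starting frame plays the role of the image prompt, and the list of numeric actions stands in for player input such as keyboard presses.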
Notably, DeepMind also released an AI model dubbed Scalable Instructable Multiworld Agent, or SIMA, earlier this year, which essentially has agentic capabilities in 3D worlds. The company says Genie 2 can provide unique environments to similar AI agents and train them for various real-life scenarios.
Since the world model can generate unique environments, Google says this will eliminate the risk of data contamination and allow developers to correctly assess an AI agent’s capabilities.