Yesterday, Microsoft’s Xbox team unveiled Muse, which it describes as “a generative AI model designed for gameplay ideation.” The announcement was accompanied by an open-access article on Nature.com, a detailed blog entry, and a YouTube video. If the term “gameplay ideation” leaves you scratching your head, Microsoft explains it as the ability to create “game visuals, controller actions, or both.” In practice, however, its applications are narrow, and it does not replace the traditional game development process.
Nonetheless, the data Microsoft shared offers some intriguing insights. The model was trained on H100 GPUs for around a million training updates. Once trained, it can take one second of actual gameplay and extend it with nine seconds of simulated footage that stays consistent with the game engine’s behavior. Most of the training data was drawn from existing multiplayer gameplay sessions.
Training the model took a cluster of 100 Nvidia H100 GPUs, which is far more costly and power-hungry than simply running the game on a single PC. Despite those resources, the model outputs at a resolution of only 300×180 pixels for the roughly nine seconds of gameplay it generates.
What stood out in the Muse showcase was its ability to replicate existing props and enemies in the game environment and imitate their functions. Still, it is fair to ask why anyone would spend this much hardware and power when conventional development tools can already spawn an enemy or prop directly.
Muse does maintain object permanence and stays reasonably faithful to the original game’s mechanics, but its current applications look inefficient next to established development methods. Future iterations might accomplish more compelling tasks. For now, it joins a long line of projects that try to simulate gameplay entirely inside an AI model, an approach that feels far from optimal for developing, testing, or playing a game. Even after a close look at the specifics, it’s hard to see why one would choose this method over established ones.