When was MuZero first published by DeepMind?
A team at DeepMind published a preprint introducing MuZero on the 19th of November 2019. This event marked the public debut of an algorithm designed to master games without knowing their rules.
Unlike its predecessor AlphaZero, MuZero combines tree-based search with a learned model, allowing it to operate without explicit game rules. The system learns state transitions from observation rather than from hand-coded rules, and it never consults a simulator that knows those rules.
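The idea of planning inside a learned model can be sketched as follows. This is a minimal illustration, not DeepMind's implementation: the three learned networks (representation, dynamics, prediction) are replaced by hand-coded stubs with hypothetical behavior, and full Monte Carlo tree search is replaced by exhaustive shallow lookahead. Only the structure, searching over a model that predicts transitions instead of querying game rules, reflects the paper.

```python
# Minimal sketch of MuZero-style planning (illustrative only).
# Three learned functions stand in for a rule-based simulator:
#   h: observation -> hidden state                 (representation)
#   g: (hidden state, action) -> (state, reward)   (dynamics)
#   f: hidden state -> (policy, value)             (prediction)
# In MuZero these are neural networks; here they are toy stubs.
import math

ACTIONS = [0, 1]

def h(observation):
    # representation: encode the raw observation into a hidden state
    return float(observation)

def g(state, action):
    # dynamics: predict the next hidden state and reward,
    # never consulting the environment's actual rules
    next_state = state + (1.0 if action == 1 else -1.0)
    reward = 1.0 if action == 1 else 0.0
    return next_state, reward

def f(state):
    # prediction: a policy prior over actions and a value estimate
    policy = {a: 1.0 / len(ACTIONS) for a in ACTIONS}
    value = 0.1 * state
    return policy, value

def plan(observation, depth=3, discount=0.99):
    """Pick the action with the best discounted return found by
    exhaustive lookahead inside the learned model (an MCTS stand-in)."""
    def search(state, d):
        if d == 0:
            _, value = f(state)  # bootstrap with the value head at the leaf
            return value
        best = -math.inf
        for a in ACTIONS:
            next_state, reward = g(state, a)
            best = max(best, reward + discount * search(next_state, d - 1))
        return best

    root = h(observation)
    scores = {}
    for a in ACTIONS:
        s, r = g(root, a)
        scores[a] = r + discount * search(s, depth - 1)
    return max(scores, key=scores.get)
```

In this toy model action 1 always yields reward, so `plan(0.0)` selects it; the point is that every quantity the search uses (transitions, rewards, leaf values) comes from the learned functions, not from the environment.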
Initial results showed MuZero matching AlphaZero's performance in chess after roughly one million training steps. By the one-million-step mark it had slightly surpassed AlphaZero in Go, and it improved on the state of the art across fifty-seven Atari games.
For the board games, the team trained with sixteen third-generation tensor processing units (TPUs) while one thousand TPUs generated self-play games. The visual (Atari) environments required eight TPUs for training and thirty-two for self-play.
In late 2021, researchers proposed EfficientZero, a more sample-efficient variant of the original framework. It achieved 194.3 percent of mean human performance on Atari games with just two hours of real-time game experience.