When was MuZero first published by DeepMind?
A team at DeepMind published a preprint introducing MuZero on the 19th of November 2019. This event marked the public debut of an algorithm designed to master games without knowing their rules.
Unlike its predecessor AlphaZero, MuZero combines tree-based search with a learned model, allowing it to operate without explicit game rules. The system learns state transitions from observation rather than from hand-coded rules, and it never consults a simulator that knows those rules.
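The idea of planning inside a learned model can be sketched as follows. This is a minimal illustration, not DeepMind's implementation: the three learned networks (representation, dynamics, prediction) are replaced by hand-coded stubs with hypothetical behavior, and full Monte Carlo tree search is replaced by exhaustive shallow lookahead. Only the structure, searching over a model that predicts transitions instead of querying game rules, reflects the paper.

```python
# Minimal sketch of MuZero-style planning (illustrative only).
# Three learned functions stand in for a rule-based simulator:
#   h: observation -> hidden state                 (representation)
#   g: (hidden state, action) -> (state, reward)   (dynamics)
#   f: hidden state -> (policy, value)             (prediction)
# In MuZero these are neural networks; here they are toy stubs.
import math

ACTIONS = [0, 1]

def h(observation):
    # representation: encode the raw observation into a hidden state
    return float(observation)

def g(state, action):
    # dynamics: predict the next hidden state and reward,
    # never consulting the environment's actual rules
    next_state = state + (1.0 if action == 1 else -1.0)
    reward = 1.0 if action == 1 else 0.0
    return next_state, reward

def f(state):
    # prediction: a policy prior over actions and a value estimate
    policy = {a: 1.0 / len(ACTIONS) for a in ACTIONS}
    value = 0.1 * state
    return policy, value

def plan(observation, depth=3, discount=0.99):
    """Pick the action with the best discounted return found by
    exhaustive lookahead inside the learned model (an MCTS stand-in)."""
    def search(state, d):
        if d == 0:
            _, value = f(state)  # bootstrap with the value head at the leaf
            return value
        best = -math.inf
        for a in ACTIONS:
            next_state, reward = g(state, a)
            best = max(best, reward + discount * search(next_state, d - 1))
        return best

    root = h(observation)
    scores = {}
    for a in ACTIONS:
        s, r = g(root, a)
        scores[a] = r + discount * search(s, depth - 1)
    return max(scores, key=scores.get)
```

In this toy model action 1 always yields reward, so `plan(0.0)` selects it; the point is that every quantity the search uses (transitions, rewards, leaf values) comes from the learned functions, not from the environment.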
Initial results showed MuZero matching AlphaZero's performance in chess after roughly one million training steps. By the one-million-step mark it had slightly surpassed AlphaZero in Go, and it improved on the state of the art across fifty-seven Atari games.
For the board games, the team trained with sixteen third-generation tensor processing units (TPUs) while one thousand TPUs generated self-play games. The visual (Atari) environments required eight TPUs for training and thirty-two for self-play.
In late 2021, researchers proposed EfficientZero, a more sample-efficient variant of the original framework. It achieved 194.3 percent of mean human performance on Atari games with just two hours of real-time game experience.