One year after the bombshell announcement that DeepMind’s AlphaZero needed only the rules of chess and four hours of self-play to beat Stockfish in a match, the long-awaited full paper has been published in the academic journal Science. We have new games – Matthew Sadler has produced videos about five of them – and what seems to be conclusive evidence of AlphaZero’s superiority. It won a new match 574.5:425.5, even though Stockfish was running in a powerful configuration and managing its own time. AlphaZero also won when given just 1/10th of Stockfish’s thinking time.
AlphaZero would be extraordinary even if it had only reached “human” levels of attainment. It began as AlphaGo, which learned from human games to become the world’s best Go player, and then developed into AlphaGo Zero, which managed to surpass AlphaGo merely by playing against itself with no human input. AlphaZero is the new generalised version of that “reinforcement and search algorithm”, which the DeepMind team have shown can master multiple games – chess, shogi and Go – knowing only the rules. In the case of chess, AlphaZero needed 300,000 of the 700,000 “steps” it took while training – just 4 hours (of 9 in total) – to reach a level at which it was beating Stockfish.
During the World Championship match we featured content from two-time British Champion Matthew Sadler and WIM Natasha Regan, who are co-authoring the book Game Changer. They appear in this short video looking at AlphaZero:
- DeepMind on AlphaZero
- The full AlphaZero paper for Science
- Matthew Sadler’s chess24 profile
- Matthew Sadler’s chess24 video series
- Jan Gustafsson: Learn from AlphaZero & beat the Queen’s Indian
- AlphaZero crushes chess
- Carlsen-Caruana 2018 World Championship games
- AlphaZero on Carlsen-Caruana Games 1-8
- AlphaZero on Carlsen-Caruana Games 9-12