Google's AlphaZero surpassed the sum of human chess knowledge -- in 4 hours

Google’s latest AI, AlphaZero, just defeated the world’s champion chess program Stockfish — after only four hours of learning, by itself, without any human input beyond the game’s rules.


Mastering chess can take a human a lifetime, but a big enough brain can get it done in an afternoon. At least, that’s the case with Google’s newest AI, AlphaZero. The program showed “superhuman performance” at the game, beating the world’s champion program Stockfish after only four hours of practice.

To be blunt, this AI managed to surpass the highest peaks of human achievement in chess in half your shift.

Castle and rule

AlphaZero was instructed only on the ruleset of chess and nothing more. Starting without any strategy to use as a crutch, the AI needed only four hours to master the game to such an extent that it destroyed Stockfish — the highest-rated chess-playing program today.

The firm’s DeepMind division says that it played 100 games against Stockfish 8. Each program was given one minute’s worth of thinking time per move. AlphaZero won 25 games in which it played with white (gaining the first-move advantage) and a further three in which it played black. The two programs drew the remaining 72 games.
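To put that result in perspective, the tally can be turned into a match score and a rough rating gap. The short Python sketch below uses only the win and draw counts reported above. The conversion to a rating difference assumes the standard Elo expected-score formula, which is an illustration added here and not a figure from DeepMind's paper.

```python
import math

# Counts from the reported 100-game match (AlphaZero's perspective).
wins = 25 + 3      # 25 wins with white, 3 with black
draws = 72
losses = 0         # 28 wins plus 72 draws account for all 100 games

games = wins + draws + losses
# Chess scoring: 1 point per win, half a point per draw.
score = (wins + 0.5 * draws) / games

# Assumption: the standard Elo model, where an expected score E
# corresponds to a rating gap of 400 * log10(E / (1 - E)).
elo_gap = 400 * math.log10(score / (1 - score))

print(f"Score: {score:.0%} over {games} games")
print(f"Implied rating gap: about {elo_gap:.0f} Elo points")
```

A single 100-game match is a small sample, so the implied gap is only a rough indication of how far ahead AlphaZero was under these particular match conditions.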

Stockfish 8 had previously won 2016’s Top Chess Engine Championship. The software was first released in 2008 and has been improved on by volunteers in the years since.

“We now know who our new overlord is,” quipped chess researcher David Kramaley, CEO of chess science website Chessable. “It will no doubt revolutionise the game, but think about how this could be applied outside chess. This algorithm could run cities, continents, universes.”

AlphaZero was developed at Google’s DeepMind labs and is a more general version of AlphaGo Zero, the latest in the AlphaGo line of programs that ousted the human champion of Go, a Chinese board game often considered the most difficult strategy game in the world. That victory over a human had been considered the peak of the program’s ability, but DeepMind kept refining the AI, culminating in a startling success in October: AlphaGo Zero, a new, fully autonomous version that learned only by playing against itself, never against humans, bested all its previous incarnations.

By contrast, AlphaGo Zero’s predecessors learned how to play the game, in part, by watching moves made by human players. This was believed to help the fledgling software improve its game. However, in a slight blow to the human ego, it might have actually hindered the AI, considering that AlphaGo Zero’s fully self-reliant learning was so much more effective in a one-on-one competition.

“What we’re seeing here is a model free from human bias and presuppositions. It can learn whatever it determines is optimal, which may indeed be more nuanced than our own conceptions of the same,” MIT computer scientist Nick Hynes told Gizmodo following the October victory.

“It’s like an alien civilisation inventing its own mathematics.”

But it took DeepMind less than two months to best even that achievement. In their new paper, the team shows how AlphaZero takes this self-play method, a form of reinforcement learning, and combines it with a more generally applicable approach. This allows the AI to tackle a broader range of problems. It plays not just chess but also shogi (Japanese chess) and Go, and it needed only two and eight hours of training, respectively, to master those two games.
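The idea of learning a game from nothing but its rules can be shown on a much smaller scale. The Python sketch below is a toy illustration, not DeepMind's method: AlphaZero relies on deep neural networks and tree search, while this example uses a simple lookup table and a hypothetical stone-taking game (remove one to three stones, whoever takes the last stone wins). All names and parameters here are assumptions chosen for the example. The program plays against itself and nudges its estimate of each move toward the result that followed.

```python
import random

MAX_TAKE = 3  # a move removes 1, 2 or 3 stones; taking the last stone wins


def legal_moves(stones):
    return range(1, min(MAX_TAKE, stones) + 1)


def train(max_pile=12, episodes=20000, alpha=0.5, epsilon=0.5, seed=0):
    rng = random.Random(seed)
    # q[(stones, move)] estimates the outcome for the player about to move:
    # values near +1 mean a winning move, values near -1 a losing one.
    q = {(s, m): 0.0 for s in range(1, max_pile + 1) for m in legal_moves(s)}
    for _ in range(episodes):
        stones = rng.randint(1, max_pile)
        while stones > 0:
            moves = list(legal_moves(stones))
            if rng.random() < epsilon:
                move = rng.choice(moves)  # explore a random move
            else:
                move = max(moves, key=lambda m: q[(stones, m)])  # best known
            remaining = stones - move
            if remaining == 0:
                target = 1.0  # the mover took the last stone and wins
            else:
                # The same table plays the other side next, so the
                # opponent's best outcome is the mover's worst.
                target = -max(q[(remaining, m)] for m in legal_moves(remaining))
            q[(stones, move)] += alpha * (target - q[(stones, move)])
            stones = remaining
    return q


def best_move(q, stones):
    return max(legal_moves(stones), key=lambda m: q[(stones, m)])


q = train()
print({s: best_move(q, s) for s in (5, 6, 7, 9, 10, 11)})
```

No strategy is coded in, yet after training the table should favour leaving the opponent a multiple of four stones, which is the known winning strategy for this game. Chess has far too many positions for a table like this, which is why a system like AlphaZero needs a neural network to generalise across positions.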

Google’s scientists aren’t publicly commenting on the research yet, and the paper is still awaiting peer review. One thing is certain, though: AlphaZero has made a lot of waves in the chess community.

“I always wondered how it would be if a superior species landed on Earth and showed us how they played chess,” grandmaster Peter Nielsen told the BBC.

“Now I know.”

Who knows, maybe AlphaZero will be the computer to finally crack chess forever.

The paper “Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm” has been posted on arXiv, the preprint server hosted by Cornell University.