We’ve come to accept that human players don’t stand a chance against a computer with enough computing power at finite, open games like checkers or chess. Poker is markedly different because the computer doesn’t know its human opponent’s hand. No matter: a group of computer scientists from the University of Alberta in Canada have programmed an AI to play heads-up Texas hold’em poker so well (essentially perfectly, according to its authors) that no opponent can beat it in the long run. There’s a catch, though: it can only do this at limit Texas hold’em, where the size of each bet is fixed. The no-limit game is still too big for it to handle.
Poker has been solved – almost
The computer program in question is called Cepheus and, according to its makers, it is the first to play an essentially perfect game of poker. In the simplest terms, the AI plays a strategy that is guaranteed not to lose money in the long run, statistically speaking. In other words, yes, the computer will lose some hands, but over enough hands it will at worst break even, and it will come out ahead against any opponent who makes mistakes.
It’s about more than just winning at poker, though. It’s about setting a milestone in artificial intelligence and game theory. Previously, games like checkers and Connect Four had been solved, and computers had come to outplay the best humans at Othello and chess, leaving a human player little chance of besting the machine. All of these games, however, have one important characteristic in common: they are games of perfect information, where every player can see everything relevant to their decisions. Poker is the opposite, a game of imperfect information, where the most relevant piece of information, the other player’s cards, is exactly what is not known.
Overall, heads-up limit hold’em poker has about 3 x 10^14 possible decision points, which means a computer would need some 262 terabytes of memory to store a full solution. Luckily, a well-chosen algorithm can concentrate on the decisions that matter. The Canadian team used one called counterfactual regret minimisation (CFR), which essentially is all about learning from past mistakes. Cepheus starts out choosing its actions more or less at random and records how each decision turns out. It then retraces its steps and works out how much more it would have won had it chosen a different action at that point: this is the regret value. Actions with high regret are played more often in future, so the program gradually stops repeating its mistakes, the authors write in the journal Science.
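To make the idea of a regret value concrete, here is a minimal Python sketch of regret matching, the update rule that CFR applies at each decision point. It is an illustration only, not the Cepheus code: it uses rock-paper-scissors instead of poker, and the names `payoff`, `strategy_from_regrets` and `train` are made up for this example. Two copies of the program play each other, and after every round each one adds up how much better every alternative move would have done.

```python
import random

ACTIONS = ["rock", "paper", "scissors"]  # toy game, assumed for illustration


def payoff(mine, theirs):
    """Payoff for playing action index `mine` against `theirs`: +1 win, 0 tie, -1 loss."""
    return [0, 1, -1][(mine - theirs) % 3]


def strategy_from_regrets(regrets):
    """Play each action in proportion to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total == 0:
        # No action is regretted yet: fall back to uniform random play.
        return [1.0 / len(regrets)] * len(regrets)
    return [p / total for p in positive]


def train(iterations, seed=0):
    """Self-play with regret matching; returns each player's average strategy."""
    rng = random.Random(seed)
    n = len(ACTIONS)
    regrets = [[0.0] * n for _ in range(2)]
    strategy_sums = [[0.0] * n for _ in range(2)]
    for _ in range(iterations):
        strategies = [strategy_from_regrets(r) for r in regrets]
        moves = [rng.choices(range(n), weights=s)[0] for s in strategies]
        for player in range(2):
            mine, theirs = moves[player], moves[1 - player]
            actual = payoff(mine, theirs)
            for action in range(n):
                # Regret: what this action would have earned minus what we earned.
                regrets[player][action] += payoff(action, theirs) - actual
                strategy_sums[player][action] += strategies[player][action]
    # The average strategy over all rounds is what approaches equilibrium.
    return [[s / iterations for s in sums] for sums in strategy_sums]


if __name__ == "__main__":
    for name, prob in zip(ACTIONS, train(100_000)[0]):
        print(f"{name}: {prob:.3f}")
```

With enough rounds, the average strategy should settle near one third for each move, which is the unbeatable way to play rock-paper-scissors. CFR carries the same bookkeeping over to every decision point in the poker game tree, weighting each regret by how likely that situation is to be reached.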
Given enough hands, it will eventually work through enough mistakes to play almost perfectly. In fact, it played 24 trillion hands of poker over 70 days on a cluster of 200 computers, each with 24 central processing units and 32 GB of RAM, running an improved version of the algorithm called CFR+. The finished Cepheus needs only about 11 terabytes of storage. Next up, the researchers plan to adapt Cepheus to no-limit poker, where a player can bet any amount. In that case the number of possible situations is astronomical, but the authors still hope the computer can beat even the best players in the world.
Story via ASAP Science