The classic "man vs. machine" set-up ended in human victory back in 2015; two years later, the machines claimed revenge.
Two years ago, a line-up of Doug Polk, Dong Kim, Bjorn Li and Jason Les beat Claudico, Carnegie Mellon University’s first attempt at GAI software for heads-up no-limit hold’em. As we reported earlier, the rematch took place two years later at the Rivers Casino in Pittsburgh, PA, with different participants on both sides. Dong Kim and Jason Les returned on the human side, joined by two new pros: Jimmy “ForTheSwaRMm” Chou and Daniel “Dougiedan678” McAulay. Carnegie Mellon University, meanwhile, spent the two years developing a new GAI program, Libratus.
Before the heads-up match, Prof. Tuomas Sandholm, one of the creators of Libratus, called no-limit Texas hold’em the “last frontier (…) of game-solving in AI”, since in other games, such as chess, AI has already reliably surpassed human capabilities, which had not been the case in NL hold’em.
Well, the last frontier appears to have been crossed: after 120,000 hands of heads-up play, Libratus came out victorious, taking a combined $1,766,250 in chips from its human opponents.
Jason Les had this to say about the new AI in a video posted on Carnegie Mellon University’s YouTube channel: “It’s quite a bit better than Claudico in 2015. It made pretty large mistakes at times, you don’t see that with Libratus. It is much more calculated and much more tough, you really have to pry every chip you can out of Libratus’s hand, and you have to do it with skill, not counting on his mistake. You don’t often see [someone] play like Libratus, that’s like ‘250%, 500%’; all-in for like 2000 in the middle, Libratus is all-in for 19 000.” In a Reddit AMA thread, Les also said he believes the new AI development could become a problem for the future of poker.
The experiment was still worthwhile for the four pro players, since they split the $200,000 prize pool among themselves based on their performance relative to each other. On the other side, developers Sandholm and Noam Brown are evidently satisfied with the results and say they plan to apply what they have learnt to other fields that are “games of incomplete information”, such as negotiation, cyber security or medical treatment planning.