A Study of No-Limit Hold'em Using Convolutional Networks (畳み込みネットワークによるNo-Limit Hold'emの研究)

Juho Hwang (黄 柱皓), Tomoyuki Kaneko (金子 知適)

Poker is a family of card games with many variants, and Texas Hold'em, one of the most popular, has been studied extensively. Much of this work tries to solve the game by computing a Nash equilibrium, but that approach is difficult to apply directly to games whose computational cost is enormous, as is the case for many No-Limit variants with their considerably larger number of game states. Recently, a poker agent based on a Convolutional Neural Network (CNN) was proposed that learns the game state as a pattern recognition problem. This paper applies that method to No-Limit Hold'em. We discuss a poker agent trained by repeatedly learning from hand histories: initially those played between very simple heuristic players, and subsequently those generated by self-play between CNN-trained models. In our experiments, each successive generation yielded a stronger player that learned the weaknesses of the previous generations and exploited them effectively.
