Improving Playouts Using Transition Probabilities in Monte Carlo Shogi

Ugajin, Takuya (宇賀神 拓也)

Monte Carlo methods have performed well in a variety of games, which shows that they are useful for game-playing programs, and introducing game knowledge into the playout phase is known to improve performance further. However, there are few successful applications to Shogi. One reason is that a Shogi game seldom reaches an end when moves are chosen purely at random. In this paper, instead of choosing playout moves completely at random, we select them using transition probabilities, aiming to raise playout quality while preserving playout speed so that playouts reach the end of the game more often. We also apply a best-of-n algorithm with the aim of making game endings still more frequent. Experiments showed that selecting moves by transition probability ends games more often than random selection, and that adding the best-of-n algorithm increases this further. In matches between two UCB1-based players, the one using the improved playout recorded 97 wins and 3 losses against the one using the unimproved playout.
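As a rough illustration of two ingredients named in the abstract, the sketch below shows weighted move sampling and the standard UCB1 selection rule. It is not the author's implementation: the function names `sample_move` and `ucb1_select` are hypothetical, the weights merely stand in for transition probabilities (the abstract does not say how they are computed), and the best-of-n step is omitted because the abstract does not define it.

```python
import math
import random


def sample_move(moves, weights, rng=random):
    """Pick one move with probability proportional to its weight.

    The weights are a stand-in for transition probabilities (an assumption;
    the abstract does not specify how they are obtained).
    """
    return rng.choices(moves, weights=weights, k=1)[0]


def ucb1_select(wins, visits):
    """Return the index of the candidate move with the highest UCB1 score.

    wins[i] is the number of playouts won after move i,
    visits[i] is the number of playouts run after move i.
    """
    # Any move that has never been tried is selected first.
    for i, n in enumerate(visits):
        if n == 0:
            return i
    total = sum(visits)
    # Standard UCB1: empirical mean plus exploration bonus.
    scores = [
        w / n + math.sqrt(2.0 * math.log(total) / n)
        for w, n in zip(wins, visits)
    ]
    return max(range(len(scores)), key=scores.__getitem__)
```

In a Monte Carlo player of this kind, something like `ucb1_select` would choose which root move to simulate next, while something like `sample_move` would replace uniform random selection inside each playout.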
