Identification of the Strongest Die in Dueling Bandits

LU Shang, HATANO Kohei, KIJIMA Shuji, TAKIMOTO Eiji

doi:10.1587/transfun.2024eap1078

説明

<p>This work introduces the dueling dice problem, which is a variant of the multi-armed dueling bandit problem. A die is a set of m arms in this problem, and the goal is to find the best set of m arms from n arms (m ≤ n) by an iteration of dueling dice. In a round, the learner arbitrarily chooses two dice α ⊆ [n] and β ⊆ [n] and lets them duel, where she roles dice α and β, observes a pair of arms i ∈ α and j ∈ β, and receives a probabilistic result X_i,j ∈ {0, 1}. This paper investigates the sample complexity of an identification of the Condorcet winner die, and gives an upper bound O(nh^-2(log log h^-1 + log nm²γ^-1)m log m) where h is a gap parameter and γ is an error parameter. Our problem is closely related to the dueling teams problem by Cohen et al. 2021. We assume a total order of the strength over arms similarly to Cohen et al. 2021, which ensures the existence of the Condorcet winner die, but we do not assume a total order of the strength over dice unlike Cohen et al. 2021.</p>

収録刊行物

IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences

IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences advpub (0), 2025

一般社団法人電子情報通信学会

詳細情報詳細情報について

CRID: 1390021558788574080

DOI: 10.1587/transfun.2024eap1078

ISSN: 17451337; 09168508

Web Site: https://www.jstage.jst.go.jp/article/transfun/advpub/0/advpub_2024EAP1078/_pdf

本文言語コード: en

データソース種別

JaLC
Crossref

抄録ライセンスフラグ: 使用不可

書き出し

問題の指摘

Identification of the Strongest Die in Dueling Bandits

この論文をさがす

説明

収録刊行物

キーワード

詳細情報詳細情報について

書き出し

問題の指摘

Identification of the Strongest Die in Dueling Bandits

この論文をさがす

説明

収録刊行物

キーワード

詳細情報 詳細情報について

書き出し

問題の指摘

詳細情報詳細情報について