-
- LU Shang
- Department of Informatics, Kyushu University
-
- HATANO Kohei
- Department of Informatics, Kyushu University RIKEN AIP
-
- KIJIMA Shuji
- Faculty of Data Science, Shiga University
-
- TAKIMOTO Eiji
- Department of Informatics, Kyushu University
この論文をさがす
説明
<p>This work introduces the dueling dice problem, which is a variant of the multi-armed dueling bandit problem. A die is a set of m arms in this problem, and the goal is to find the best set of m arms from n arms (m ≤ n) by an iteration of dueling dice. In a round, the learner arbitrarily chooses two dice α ⊆ [n] and β ⊆ [n] and lets them duel, where she roles dice α and β, observes a pair of arms i ∈ α and j ∈ β, and receives a probabilistic result Xi,j ∈ {0, 1}. This paper investigates the sample complexity of an identification of the Condorcet winner die, and gives an upper bound O(nh-2(log log h-1 + log nm2γ-1)m log m) where h is a gap parameter and γ is an error parameter. Our problem is closely related to the dueling teams problem by Cohen et al. 2021. We assume a total order of the strength over arms similarly to Cohen et al. 2021, which ensures the existence of the Condorcet winner die, but we do not assume a total order of the strength over dice unlike Cohen et al. 2021.</p>
収録刊行物
-
- IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences
-
IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences advpub (0), 2025
一般社団法人 電子情報通信学会
- Tweet
詳細情報 詳細情報について
-
- CRID
- 1390021558788574080
-
- ISSN
- 17451337
- 09168508
-
- 本文言語コード
- en
-
- データソース種別
-
- JaLC
- Crossref
-
- 抄録ライセンスフラグ
- 使用不可