An improved upper bound on the expected regret of UCB-type policies for a matching-selection bandit problem
Description
We improve an upper bound on the expected regret of LLR, a UCB-type policy, for a bandit problem that repeats the following round: a player selects a maximal matching on a complete bipartite graph $K_{M,N}$ and receives a reward for each component edge of the selected matching. The rewards of each edge are assumed to be generated independently of that edge's previous rewards, according to an unknown fixed distribution. Our upper bound is smaller than the best known result (Chen et al., 2013) by a factor of $\Theta(M^{2/3})$.
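The round structure described above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the paper's implementation: the function names `best_matching` and `run_llr` are hypothetical, rewards are assumed to be Bernoulli, M <= N is assumed, the index uses the commonly cited LLR form (empirical mean plus sqrt((L+1) ln t / count) with L = M), and the best matching is found by brute force, which is only practical for tiny graphs.

```python
import itertools
import math
import random


def best_matching(weights):
    """Brute-force a maximum-weight maximal matching of K_{M,N} (assumes M <= N)."""
    M, N = len(weights), len(weights[0])
    best_cols = max(
        itertools.permutations(range(N), M),
        key=lambda cols: sum(weights[i][cols[i]] for i in range(M)),
    )
    return [(i, best_cols[i]) for i in range(M)]


def run_llr(mu, T, seed=0):
    """Simulate an LLR-style policy on Bernoulli edge rewards with means mu.

    Returns (cumulative pseudo-regret, per-edge play counts).
    The exploration bonus is an assumed form; see the paper for exact constants.
    """
    rng = random.Random(seed)
    M, N = len(mu), len(mu[0])
    L = M  # number of edges in a maximal matching when M <= N
    counts = [[0] * N for _ in range(M)]
    means = [[0.0] * N for _ in range(M)]
    opt = sum(mu[i][j] for i, j in best_matching(mu))
    regret = 0.0
    for t in range(1, T + 1):
        if t <= N:
            # Initialization: N cyclically shifted matchings cover every edge once.
            matching = [(i, (i + t - 1) % N) for i in range(M)]
        else:
            # UCB-type index per edge: empirical mean plus exploration bonus.
            index = [
                [
                    means[i][j] + math.sqrt((L + 1) * math.log(t) / counts[i][j])
                    for j in range(N)
                ]
                for i in range(M)
            ]
            matching = best_matching(index)
        for i, j in matching:
            reward = 1.0 if rng.random() < mu[i][j] else 0.0
            counts[i][j] += 1
            # Incremental update of the empirical mean for this edge.
            means[i][j] += (reward - means[i][j]) / counts[i][j]
        regret += opt - sum(mu[i][j] for i, j in matching)
    return regret, counts
```

For example, `run_llr([[0.9, 0.1, 0.2], [0.8, 0.7, 0.1]], T=200)` simulates 200 rounds on $K_{2,3}$. For larger graphs, the brute-force search would normally be replaced by a polynomial-time assignment solver.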
Journal
- Operations Research Letters
Operations Research Letters 43 (6), 558-563, 2015-11
Elsevier BV
Details
- CRID: 1360848657346734976
- ISSN: 01676377
- Article Type: journal article
- Data Source: Crossref, KAKEN, OpenAIRE