Quantum greedy algorithms for multi-armed bandits

Times cited: 0
Author
Ohno, Hiroshi [1 ]
Affiliation
[1] Toyota Cent Res & Dev Labs Inc, 41-1 Yokomichi, Nagakute, Aichi 4801192, Japan
Keywords
Multi-armed bandits; ε-greedy algorithm; MovieLens dataset; Quantum amplitude amplification; Regret analysis
DOI
10.1007/s11128-023-03844-2
CLC number
O4 [Physics]
Subject classification code
0702
Abstract
Multi-armed bandits are widely used in machine learning applications such as recommendation systems. Here, we implement two quantum versions of the ε-greedy algorithm, a popular algorithm for multi-armed bandits. One of the quantum greedy algorithms uses a quantum maximization algorithm, and the other is a simple algorithm that uses an amplitude encoding method as a quantum subroutine in place of the argmax operation in the ε-greedy algorithm. For the former algorithm, given a quantum oracle, the query complexity is O(√K) in each round, where K is the number of arms. For the latter algorithm, quantum parallelism is achieved through the quantum superposition of the arms, and the run-time complexity is O(K)/O(log K) in each round. Bernoulli reward distributions and the MovieLens dataset are used to compare the algorithms with their classical counterparts. The experimental results show that, for best arm identification, the performance of the quantum greedy algorithm is comparable with that of the classical counterparts.
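For reference, below is a minimal sketch of the classical ε-greedy baseline that the abstract describes: with probability ε a random arm is explored, otherwise the arm with the highest empirical mean is exploited. The argmax step marked below is the operation that, per the abstract, the second quantum algorithm replaces with an amplitude-encoding subroutine. All names, parameters, and reward probabilities here are illustrative assumptions, not code from the paper.

import random

def epsilon_greedy(pull_arm, n_arms, n_rounds, epsilon=0.1):
    # Classical epsilon-greedy: explore a random arm with probability epsilon,
    # otherwise exploit the arm with the highest empirical mean reward.
    counts = [0] * n_arms            # number of pulls per arm
    means = [0.0] * n_arms           # running empirical mean reward per arm
    for _ in range(n_rounds):
        if random.random() < epsilon:
            arm = random.randrange(n_arms)                    # explore
        else:
            arm = max(range(n_arms), key=lambda a: means[a])  # exploit (argmax step)
        reward = pull_arm(arm)                                # observe reward
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]     # incremental mean update
    return means

# Usage with Bernoulli reward distributions, as in the paper's experiments;
# the success probabilities are arbitrary illustrative values.
probs = [0.2, 0.5, 0.8]
estimates = epsilon_greedy(lambda a: 1 if random.random() < probs[a] else 0,
                           n_arms=len(probs), n_rounds=10000)
print(estimates)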
Pages: 20
Related papers
50 records in total
  • [41] Secure Outsourcing of Multi-Armed Bandits
    Ciucanu, Radu
    Lafourcade, Pascal
    Lombard-Platet, Marius
    Soare, Marta
    2020 IEEE 19TH INTERNATIONAL CONFERENCE ON TRUST, SECURITY AND PRIVACY IN COMPUTING AND COMMUNICATIONS (TRUSTCOM 2020), 2020, : 202 - 209
  • [42] Decentralized Exploration in Multi-Armed Bandits
    Feraud, Raphael
    Alami, Reda
    Laroche, Romain
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
  • [43] Optimal Algorithms for Stochastic Multi-Armed Bandits with Heavy Tailed Rewards
    Lee, Kyungjae
    Yang, Hongjun
    Lim, Sungbin
    Oh, Songhwai
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [44] Multi-armed bandits with episode context
    Rosin, Christopher D.
    ANNALS OF MATHEMATICS AND ARTIFICIAL INTELLIGENCE, 2011, 61 (03) : 203 - 230
  • [45] Introduction to Multi-Armed Bandits (Preface)
    Slivkins, Aleksandrs
    FOUNDATIONS AND TRENDS IN MACHINE LEARNING, 2019, 12 (1-2): : 1 - 286
  • [46] Federated Multi-armed Bandits with Personalization
    Shi, Chengshuai
    Shen, Cong
    Yang, Jing
    24TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS (AISTATS), 2021, 130
  • [47] The Unreasonable Effectiveness of Greedy Algorithms in Multi-Armed Bandit with Many Arms
    Bayati, Mohsen
    Hamidi, Nima
    Johari, Ramesh
    Khosravi, Khashayar
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [48] Lévy bandits: Multi-armed bandits driven by Lévy processes
    Kaspi, Haya
    Mandelbaum, Avi
    ANNALS OF APPLIED PROBABILITY, 1995, 5 (02): : 541 - 565
  • [49] Upper-Confidence-Bound Algorithms for Active Learning in Multi-armed Bandits
    Carpentier, Alexandra
    Lazaric, Alessandro
    Ghavamzadeh, Mohammad
    Munos, Remi
    Auer, Peter
    ALGORITHMIC LEARNING THEORY, 2011, 6925 : 189 - +