Hierarchical Bayesian Bandits

Cited by: 0

Authors:

Hong, Joey [1,4]

Kveton, Branislav [2,4]

Zaheer, Manzil [3]

Ghavamzadeh, Mohammad [4]

Affiliations:

[1] Univ Calif Berkeley, Berkeley, CA 94720 USA

[2] Amazon, Seattle, WA USA

[3] Google DeepMind, Mountain View, CA 94043 USA

[4] Google Res, Mountain View, CA 94043 USA

Source:

INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 151 | 2022 / Vol. 151

Keywords:

ALLOCATION;

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Meta-, multi-task, and federated learning can all be viewed as solving similar tasks, drawn from a distribution that reflects task similarities. We provide a unified view of all these problems, as learning to act in a hierarchical Bayesian bandit. We propose and analyze a natural hierarchical Thompson sampling algorithm (HierTS) for this class of problems. Our regret bounds hold for many variants of the problems, including when the tasks are solved sequentially or in parallel, and show that the regret decreases with a more informative prior. Our proofs rely on a novel total variance decomposition that can be applied beyond our models. Our theory is complemented by experiments, which show that the hierarchy helps with knowledge sharing among the tasks. This confirms that hierarchical Bayesian bandits are a universal and statistically efficient tool for learning to act with similar bandit tasks.
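
The abstract does not spell out the algorithm, so the following is only a minimal sketch of what two-stage (hierarchical) Thompson sampling could look like. It assumes a Gaussian multi-armed bandit with known variances: a shared hyper-parameter drawn from N(mu_q, sigma_q^2), per-task arm means drawn around it with standard deviation sigma_0, and reward noise sigma. All names (`hier_ts_step`, `run`, the variance parameters, the round-robin task order) are hypothetical choices made for illustration, not the authors' implementation.

```python
import numpy as np


def hier_ts_step(counts, sums, mu_q, sigma_q, sigma_0, sigma, task, rng):
    """Choose an arm for `task` by sampling the hierarchy top-down.

    counts[s, a] and sums[s, a] hold the number of pulls and the total
    reward of arm a in task s (float arrays of the same shape).
    """
    # Stage 1: sample the shared hyper-parameter from its posterior.
    # Evidence from every task is pooled; arms are treated independently.
    denom = counts * sigma_0**2 + sigma**2        # strictly positive when sigma > 0
    hyper_prec = 1.0 / sigma_q**2 + (counts / denom).sum(axis=0)
    hyper_mean = (mu_q / sigma_q**2 + (sums / denom).sum(axis=0)) / hyper_prec
    mu = rng.normal(hyper_mean, np.sqrt(1.0 / hyper_prec))

    # Stage 2: sample this task's arm means given the sampled hyper-parameter.
    task_prec = 1.0 / sigma_0**2 + counts[task] / sigma**2
    task_mean = (mu / sigma_0**2 + sums[task] / sigma**2) / task_prec
    theta = rng.normal(task_mean, np.sqrt(1.0 / task_prec))
    return int(np.argmax(theta))


def run(num_tasks=5, num_arms=3, horizon=200, seed=0):
    """Simulate the sketch on synthetic tasks and return (regret, counts)."""
    rng = np.random.default_rng(seed)
    mu_q, sigma_q, sigma_0, sigma = 0.0, 1.0, 0.1, 0.5  # assumed, known to the agent

    # Draw the hidden hyper-parameter, then similar tasks centered on it.
    mu_star = rng.normal(mu_q, sigma_q, size=num_arms)
    theta_star = rng.normal(mu_star, sigma_0, size=(num_tasks, num_arms))

    counts = np.zeros((num_tasks, num_arms))
    sums = np.zeros((num_tasks, num_arms))
    regret = 0.0
    for t in range(horizon):
        task = t % num_tasks  # assumption: tasks are visited round-robin
        arm = hier_ts_step(counts, sums, mu_q, sigma_q, sigma_0, sigma, task, rng)
        reward = rng.normal(theta_star[task, arm], sigma)
        counts[task, arm] += 1
        sums[task, arm] += reward
        regret += theta_star[task].max() - theta_star[task, arm]
    return regret, counts


if __name__ == "__main__":
    total_regret, pulls = run()
    print("cumulative regret:", total_regret)
    print("pulls per task and arm:\n", pulls)
```

In this sketch, knowledge sharing comes entirely from stage 1: a task with few observations still samples around a hyper-parameter that has been sharpened by the other tasks' data.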


Pages: 18

50 items in total

[1] Metadata-based Multi-Task Bandits with Bayesian Hierarchical Models
Wan, Runzhe
Ge, Lin
Song, Rui
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
[2] BAYESIAN NONPARAMETRIC BANDITS
CLAYTON, MK
BERRY, DA
ANNALS OF STATISTICS, 1985, 13 (04) : 1523 - 1534
[3] Hierarchical Unimodal Bandits
Zhao, Tianchi
Zhang, Chicheng
Li, Ming
MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2022, PT IV, 2023, 13716 : 269 - 283
[4] Bayesian Algorithms for Decentralized Stochastic Bandits
Lalitha A.
Goldsmith A.
IEEE Journal on Selected Areas in Information Theory, 2021, 2 (02) : 564 - 583
[5] A Bayesian Approach for Subset Selection in Contextual Bandits
Li, Jialian
Du, Chao
Zhu, Jun
THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 8384 - 8391
[6] Bayesian Contextual Bandits for Hyper Parameter Optimization
Sui, Guoxin
Yu, Yong
IEEE ACCESS, 2020, 8 : 42971 - 42979
[7] Efficient Online Bayesian Inference for Neural Bandits
Duran-Martin, Gerardo
Kara, Aleyna
Murphy, Kevin
INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 151, 2022, 151 : 6002 - 6021
[8] Continuous-in-time Limit for Bayesian Bandits
Zhu, Yuhua
Izzo, Zachary
Ying, Lexing
JOURNAL OF MACHINE LEARNING RESEARCH, 2023, 24
[9] Continuous-in-time Limit for Bayesian Bandits
Zhu, Yuhua
Izzo, Zach
Ying, Lexing
arXiv, 2022
