Hierarchical Bayesian Bandits

Cited by: 0
Authors
Hong, Joey [1, 4]
Kveton, Branislav [2, 4]
Zaheer, Manzil [3]
Ghavamzadeh, Mohammad [4]
Affiliations
[1] Univ Calif Berkeley, Berkeley, CA 94720 USA
[2] Amazon, Seattle, WA USA
[3] Google DeepMind, Mountain View, CA 94043 USA
[4] Google Res, Mountain View, CA 94043 USA
Keywords
ALLOCATION;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Meta-, multi-task, and federated learning can all be viewed as solving similar tasks drawn from a distribution that reflects task similarities. We provide a unified view of these problems as learning to act in a hierarchical Bayesian bandit. We propose and analyze a natural hierarchical Thompson sampling algorithm (HierTS) for this class of problems. Our regret bounds hold for many variants of the problems, including when the tasks are solved sequentially or in parallel, and show that the regret decreases with a more informative prior. Our proofs rely on a novel total variance decomposition that can be applied beyond our models. Our theory is complemented by experiments, which show that the hierarchy helps with knowledge sharing among the tasks. This confirms that hierarchical Bayesian bandits are a universal and statistically efficient tool for learning to act with similar bandit tasks.
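To make the idea of hierarchical Thompson sampling concrete, the following is a minimal sketch under assumptions that are not stated in the abstract: rewards are Gaussian, arms are independent, all tasks act in parallel in every round, and one hyper-parameter sample is shared across tasks per round. The function name `hier_ts_gaussian` and all parameter names (`mu_q`, `sigma_q`, `sigma_0`, `sigma`) are hypothetical and chosen for illustration; this is not the authors' implementation. In each round the sketch samples a hyper-parameter from its posterior, then samples task parameters conditioned on it, and each task pulls its best-looking arm.

```python
import numpy as np

def hier_ts_gaussian(theta, mu_q, sigma_q, sigma_0, sigma, horizon, rng):
    """Illustrative two-level Gaussian Thompson sampling (assumed model).

    theta   : (num_tasks, num_arms) true mean rewards, unknown to the learner
    mu_q    : hyper-prior mean; sigma_q: hyper-prior std
    sigma_0 : std of task parameters around the hyper-parameter
    sigma   : reward noise std
    Returns the per-round regret summed over tasks.
    """
    num_tasks, num_arms = theta.shape
    rows = np.arange(num_tasks)
    pulls = np.zeros((num_tasks, num_arms))
    reward_sums = np.zeros((num_tasks, num_arms))
    regret = np.zeros(horizon)

    for t in range(horizon):
        # Hyper-posterior per arm, with task parameters marginalized out.
        # A task with n pulls contributes precision n / (n*sigma_0^2 + sigma^2).
        denom = pulls * sigma_0**2 + sigma**2
        hyper_prec = 1.0 / sigma_q**2 + (pulls / denom).sum(axis=0)
        hyper_mean = (mu_q / sigma_q**2
                      + (reward_sums / denom).sum(axis=0)) / hyper_prec
        mu = rng.normal(hyper_mean, 1.0 / np.sqrt(hyper_prec))

        # Task posterior conditioned on the sampled hyper-parameter.
        task_prec = 1.0 / sigma_0**2 + pulls / sigma**2
        task_mean = (mu / sigma_0**2 + reward_sums / sigma**2) / task_prec
        theta_sample = rng.normal(task_mean, 1.0 / np.sqrt(task_prec))

        # Each task pulls the arm that is best under its sample.
        arms = theta_sample.argmax(axis=1)
        rewards = rng.normal(theta[rows, arms], sigma)
        pulls[rows, arms] += 1
        reward_sums[rows, arms] += rewards
        regret[t] = (theta.max(axis=1) - theta[rows, arms]).sum()

    return regret

# Synthetic problem: 10 similar tasks, 5 arms (all values are assumptions).
rng = np.random.default_rng(0)
mu_star = rng.normal(0.0, 1.0, size=5)
theta = rng.normal(mu_star, 0.5, size=(10, 5))
regret = hier_ts_gaussian(theta, mu_q=0.0, sigma_q=1.0, sigma_0=0.5,
                          sigma=1.0, horizon=200, rng=rng)
print("cumulative regret:", regret.cumsum()[-1])
```

In this sketch, observations from every task tighten the hyper-posterior, which is one way the knowledge sharing described in the abstract could arise; a smaller `sigma_q` corresponds to a more informative prior.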
Pages: 18