Bayesian decision-making under misspecified priors with applications to meta-learning

Cited by: 0
Authors
Simchowitz, Max [1]
Tosh, Christopher [2]
Krishnamurthy, Akshay [3]
Hsu, Daniel [2]
Lykouris, Thodoris [1]
Dudik, Miroslav [3]
Schapire, Robert [3]
Affiliations
[1] MIT, Cambridge, MA 02139 USA
[2] Columbia Univ, New York, NY 10027 USA
[3] Microsoft Res NYC, New York, NY USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Thompson sampling and other Bayesian sequential decision-making algorithms are among the most popular approaches to tackling explore/exploit trade-offs in (contextual) bandits. The choice of prior in these algorithms offers flexibility to encode domain knowledge but can also lead to poor performance when misspecified. In this paper, we demonstrate that performance degrades gracefully with misspecification. We prove that the expected reward accrued by Thompson sampling (TS) with a misspecified prior differs by at most $\tilde{O}(H^2 \epsilon)$ from TS with a well-specified prior, where $\epsilon$ is the total-variation distance between priors and $H$ is the learning horizon. Our bound does not require the prior to have any parametric form. For priors with bounded support, our bound is independent of the cardinality or structure of the action space, and we show that it is tight up to universal constants in the worst case. Building on our sensitivity analysis, we establish generic PAC guarantees for algorithms in the recently studied Bayesian meta-learning setting and derive corollaries for various families of priors. Our results generalize along two axes: (1) they apply to a broader family of Bayesian decision-making algorithms, including a Monte Carlo implementation of the knowledge gradient algorithm (KG), and (2) they apply to Bayesian POMDPs, the most general Bayesian decision-making setting, encompassing contextual bandits as a special case. Through numerical simulations, we illustrate how prior misspecification and the deployment of one-step look-ahead (as in KG) can impact the convergence of meta-learning in multi-armed and contextual bandits with structured and correlated priors.
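To make the comparison in the abstract concrete, the following is a minimal sketch, not the authors' experimental setup. It assumes a Beta-Bernoulli multi-armed bandit (a choice made here purely for illustration) and estimates by Monte Carlo the expected reward of Thompson sampling when the agent's prior matches the prior that generates the arm means and when it does not. The function names `run_thompson_sampling` and `average_reward`, the prior parameters, the horizon, and the episode count are all hypothetical.

```python
import numpy as np


def run_thompson_sampling(true_means, alpha0, beta0, horizon, rng):
    """Run Beta-Bernoulli Thompson sampling; return the total reward collected."""
    # The agent's posterior starts at its (possibly misspecified) prior.
    alpha = np.array(alpha0, dtype=float)
    beta = np.array(beta0, dtype=float)
    total = 0.0
    for _ in range(horizon):
        # Sample one plausible mean per arm from the current posterior.
        sampled_means = rng.beta(alpha, beta)
        arm = int(np.argmax(sampled_means))
        # Draw a Bernoulli reward from the environment's true mean.
        reward = float(rng.random() < true_means[arm])
        # Conjugate Beta update for the arm that was pulled.
        alpha[arm] += reward
        beta[arm] += 1.0 - reward
        total += reward
    return total


def average_reward(true_prior, agent_prior, horizon, n_episodes, seed):
    """Average total reward when arm means come from true_prior
    but the agent runs Thompson sampling with agent_prior."""
    rng = np.random.default_rng(seed)
    true_a, true_b = true_prior
    agent_a, agent_b = agent_prior
    totals = []
    for _ in range(n_episodes):
        # Bayesian setting: the environment itself is drawn from the true prior.
        true_means = rng.beta(true_a, true_b)
        totals.append(
            run_thompson_sampling(true_means, agent_a, agent_b, horizon, rng)
        )
    return float(np.mean(totals))


# Hypothetical priors over three arms, given as (alpha parameters, beta parameters).
true_prior = ([4.0, 1.0, 1.0], [1.0, 4.0, 4.0])
misspecified_prior = ([1.0, 1.0, 4.0], [4.0, 4.0, 1.0])

well_specified = average_reward(true_prior, true_prior, 50, 500, seed=0)
misspecified = average_reward(true_prior, misspecified_prior, 50, 500, seed=0)
print("well-specified prior:", well_specified)
print("misspecified prior:  ", misspecified)
```

The difference between the two printed averages is a Monte Carlo estimate of the reward gap that the abstract bounds by $\tilde{O}(H^2 \epsilon)$. The sketch does not compute the total-variation distance $\epsilon$ between the two priors, so it only illustrates the quantity being bounded rather than checking the bound itself.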
Pages: 13