Lifting the Information Ratio: An Information-Theoretic Analysis of Thompson Sampling for Contextual Bandits

Cited by: 0
Authors
Neu, Gergely [1]
Olkhovskaya, Julia [2]
Papini, Matteo [1]
Schwartz, Ludovic [1]
Affiliations
[1] Univ Pompeu Fabra, Barcelona, Spain
[2] Vrije Univ Amsterdam, Amsterdam, Netherlands
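Funding
European Research Council
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract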
We study the Bayesian regret of the renowned Thompson Sampling algorithm in contextual bandits with binary losses and adversarially selected contexts. We adapt the information-theoretic perspective of Russo and Van Roy [2016] to the contextual setting by considering a lifted version of the information ratio defined in terms of the unknown model parameter instead of the optimal action or optimal policy as done in previous works on the same setting. This allows us to bound the regret in terms of the entropy of the prior distribution through a remarkably simple proof, and with no structural assumptions on the likelihood or the prior. The extension to priors with infinite entropy only requires a Lipschitz assumption on the log-likelihood. An interesting special case is that of logistic bandits with d-dimensional parameters, K actions, and Lipschitz logits, for which we provide a $\widetilde{O}(\sqrt{dKT})$ regret upper bound that does not depend on the smallest slope of the sigmoid link function.
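For readers unfamiliar with the setting, the following is a minimal sketch of Thompson Sampling for a contextual bandit with binary losses, assuming a prior with finite support (and hence finite entropy). It is an illustration only and is not taken from the paper: the function name `thompson_sampling`, the table `loss_prob`, and the index `true_idx` are all hypothetical, and the loss probabilities are assumed to lie strictly between 0 and 1.

```python
import numpy as np


def thompson_sampling(loss_prob, contexts, prior, true_idx, rng):
    """Thompson Sampling with a finite prior over model parameters.

    Hypothetical setup (an assumption of this sketch, not the paper's notation):
    loss_prob[m, x, a] = P(loss = 1 | parameter m, context x, action a).
    `contexts` is any sequence of context indices; it may be chosen arbitrarily.
    `true_idx` is the index of the parameter that actually generates the losses.
    """
    posterior = np.array(prior, dtype=float)
    regret = 0.0
    for x in contexts:
        # Sample a model parameter from the current posterior.
        m = rng.choice(len(posterior), p=posterior)
        # Play the action with the smallest expected loss under the sample.
        a = int(np.argmin(loss_prob[m, x]))
        # Observe a binary loss generated by the true parameter.
        p_true = loss_prob[true_idx, x, a]
        loss = int(rng.random() < p_true)
        # Bayes update: reweight each parameter by the likelihood of the loss.
        p_one = loss_prob[:, x, a]
        likelihood = p_one if loss == 1 else 1.0 - p_one
        posterior = posterior * likelihood
        posterior /= posterior.sum()
        # Accumulate the expected regret against the best action for context x.
        regret += p_true - loss_prob[true_idx, x].min()
    return posterior, regret
```

To estimate Bayesian regret with this sketch, one would draw `true_idx` from `prior` and average the returned regret over many independent runs.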
Pages: 13
Related papers
50 in total
  • [1] An Information-Theoretic Analysis of Thompson Sampling
    Russo, Daniel
    Van Roy, Benjamin
    [J]. JOURNAL OF MACHINE LEARNING RESEARCH, 2016, 17
  • [2] Thompson Sampling for Stochastic Bandits with Noisy Contexts: An Information-Theoretic Regret Analysis
    Jose, Sharu Theresa
    Moothedath, Shana
    [J]. ENTROPY, 2024, 26 (07)
  • [3] An Information-Theoretic Analysis for Thompson Sampling with Many Actions
    Dong, Shi
    Van Roy, Benjamin
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [4] Information-theoretic analysis of information hiding
    Moulin, P
    O'Sullivan, JA
    [J]. 2000 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY, PROCEEDINGS, 2000, : 19 - 19
  • [5] Information-theoretic analysis of information hiding
    Moulin, P
    O'Sullivan, JA
    [J]. IEEE TRANSACTIONS ON INFORMATION THEORY, 2003, 49 (03) : 563 - 593
  • [6] AN INFORMATION-THEORETIC APPROACH TO INCORPORATING PRIOR INFORMATION IN BINOMIAL SAMPLING
    DYER, D
    CHIOU, P
    [J]. COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 1984, 13 (17) : 2051 - 2083
  • [7] Information-Theoretic Regret Bounds for Bandits with Fixed Expert Advice
    Eldowa, Khaled
    Cesa-Bianchi, Nicolo
    Metelli, Alberto Maria
    Restelli, Marcello
    [J]. 2023 IEEE INFORMATION THEORY WORKSHOP, ITW, 2023, : 30 - 35
  • [8] Information-theoretic analysis of watermarking
    Moulin, P
    O'Sullivan, JA
    [J]. 2000 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS, VOLS I-VI, 2000, : 3630 - 3633
  • [9] An Information-Theoretic Analysis of Deduplication
    Niesen, Urs
    [J]. IEEE TRANSACTIONS ON INFORMATION THEORY, 2019, 65 (09) : 5688 - 5704
  • [10] An Information-Theoretic Analysis of Deduplication
    Niesen, Urs
    [J]. 2017 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY (ISIT), 2017, : 1738 - 1742