Pareto Optimal Model Selection in Linear Bandits

Authors
Zhu, Yinglun [1]
Nowak, Robert [1]
Affiliations
[1] Univ Wisconsin, Madison, WI 53706 USA
Funding
National Science Foundation (USA)
DOI
Not available
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We study model selection in linear bandits, where the learner must adapt to the dimension (denoted by $d_\star$) of the smallest hypothesis class containing the true linear model while balancing exploration and exploitation. Previous papers provide various guarantees for this model selection problem but have limitations: their analyses either require favorable conditions that allow for inexpensive statistical testing to locate the right hypothesis class, or are based on the idea of "corralling" multiple base algorithms, which often performs relatively poorly in practice. These works also mainly focus on upper bounds. In this paper, we establish the first lower bound for the model selection problem. Our lower bound implies that, even with a fixed action set, adaptation to the unknown dimension $d_\star$ comes at a cost: there is no algorithm that can achieve the regret bound $\widetilde{O}(\sqrt{d_\star T})$ simultaneously for all values of $d_\star$. We propose Pareto optimal algorithms that match the lower bound. Empirical evaluations show that our algorithm enjoys superior performance compared to existing ones.
Pages: 21