Pareto Optimal Model Selection in Linear Bandits

Cited: 0
Authors: Zhu, Yinglun [1]; Nowak, Robert [1]
Affiliations: [1] Univ Wisconsin, Madison, WI 53706 USA
Funding: US National Science Foundation
DOI: none available
CLC classification: TP18 [Artificial Intelligence Theory]
Subject classification codes: 081104; 0812; 0835; 1405
Abstract:
We study model selection in linear bandits, where the learner must adapt to the dimension (denoted by d*) of the smallest hypothesis class containing the true linear model while balancing exploration and exploitation. Previous papers provide various guarantees for this model selection problem, but with limitations: the analyses either require favorable conditions that allow inexpensive statistical testing to locate the right hypothesis class, or rely on "corralling" multiple base algorithms, which often performs relatively poorly in practice. These works also focus mainly on upper bounds. In this paper, we establish the first lower bound for the model selection problem. Our lower bound implies that, even with a fixed action set, adaptation to the unknown dimension d* comes at a cost: no algorithm can achieve the regret bound Õ(√(d*T)) simultaneously for all values of d*. We propose Pareto optimal algorithms that match the lower bound. Empirical evaluations show that our algorithm enjoys superior performance compared to existing ones.
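The setting described in the abstract — a learner facing a linear bandit whose true parameter lives in an unknown low-dimensional subspace of dimension d* — can be illustrated with a minimal LinUCB-style simulation. This is a hedged sketch, not the paper's algorithm: the function name, arm construction, noise level, and confidence width below are all illustrative assumptions.

```python
import numpy as np

def linucb_regret(d_model, d_star=2, D=10, T=500, seed=0):
    """Run a ridge-regression LinUCB restricted to the first d_model coordinates.

    The true parameter theta* is supported on its first d_star coordinates,
    so a learner assuming d_model >= d_star is well-specified, while
    d_model < d_star leaves it misspecified. Returns cumulative regret.
    """
    rng = np.random.default_rng(seed)
    theta = np.zeros(D)
    theta[:d_star] = 1.0 / np.sqrt(d_star)           # unit-norm true parameter
    arms = rng.standard_normal((20, D))
    arms /= np.linalg.norm(arms, axis=1, keepdims=True)
    means = arms @ theta                              # expected reward per arm
    opt = means.max()

    A = np.eye(d_model)                               # ridge Gram matrix
    b = np.zeros(d_model)
    regret = 0.0
    for _ in range(T):
        theta_hat = np.linalg.solve(A, b)             # ridge estimate
        A_inv = np.linalg.inv(A)
        X = arms[:, :d_model]                         # learner sees d_model coords only
        width = np.sqrt(np.einsum('ij,jk,ik->i', X, A_inv, X))
        i = int(np.argmax(X @ theta_hat + 0.5 * width))
        x = X[i]
        reward = arms[i] @ theta + 0.1 * rng.standard_normal()
        A += np.outer(x, x)                           # rank-one Gram update
        b += reward * x
        regret += opt - means[i]
    return regret
```

Comparing `linucb_regret(d_star)` against `linucb_regret(D)` or an undersized `linucb_regret(1)` shows why the assumed dimension matters: too small a hypothesis class risks linear regret from misspecification, while too large a class pays extra exploration cost — the tension the paper's lower bound makes precise.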
Pages: 21