Pareto Optimal Model Selection in Linear Bandits

Cited: 0
Authors: Zhu, Yinglun [1]; Nowak, Robert [1]
Affiliations: [1] Univ Wisconsin, Madison, WI 53706 USA
Funding: US National Science Foundation
DOI: none available
CLC classification: TP18 [Artificial Intelligence Theory]
Subject classification codes: 081104; 0812; 0835; 1405
Abstract:
We study model selection in linear bandits, where the learner must adapt to the dimension (denoted by d*) of the smallest hypothesis class containing the true linear model while balancing exploration and exploitation. Previous papers provide various guarantees for this model selection problem, but with limitations: the analyses either require favorable conditions that allow inexpensive statistical testing to locate the right hypothesis class, or rely on "corralling" multiple base algorithms, which often performs relatively poorly in practice. These works also focus mainly on upper bounds. In this paper, we establish the first lower bound for the model selection problem. Our lower bound implies that, even with a fixed action set, adaptation to the unknown dimension d* comes at a cost: no algorithm can achieve the regret bound Õ(√(d*T)) simultaneously for all values of d*. We propose Pareto optimal algorithms that match the lower bound. Empirical evaluations show that our algorithm enjoys superior performance compared to existing ones.
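The setting described in the abstract — a learner facing a linear bandit whose true parameter lives in an unknown low-dimensional subspace of dimension d* — can be illustrated with a minimal LinUCB-style simulation. This is a hedged sketch, not the paper's algorithm: the function name, arm construction, noise level, and confidence width below are all illustrative assumptions.

```python
import numpy as np

def linucb_regret(d_model, d_star=2, D=10, T=500, seed=0):
    """Run a ridge-regression LinUCB restricted to the first d_model coordinates.

    The true parameter theta* is supported on its first d_star coordinates,
    so a learner assuming d_model >= d_star is well-specified, while
    d_model < d_star leaves it misspecified. Returns cumulative regret.
    """
    rng = np.random.default_rng(seed)
    theta = np.zeros(D)
    theta[:d_star] = 1.0 / np.sqrt(d_star)           # unit-norm true parameter
    arms = rng.standard_normal((20, D))
    arms /= np.linalg.norm(arms, axis=1, keepdims=True)
    means = arms @ theta                              # expected reward per arm
    opt = means.max()

    A = np.eye(d_model)                               # ridge Gram matrix
    b = np.zeros(d_model)
    regret = 0.0
    for _ in range(T):
        theta_hat = np.linalg.solve(A, b)             # ridge estimate
        A_inv = np.linalg.inv(A)
        X = arms[:, :d_model]                         # learner sees d_model coords only
        width = np.sqrt(np.einsum('ij,jk,ik->i', X, A_inv, X))
        i = int(np.argmax(X @ theta_hat + 0.5 * width))
        x = X[i]
        reward = arms[i] @ theta + 0.1 * rng.standard_normal()
        A += np.outer(x, x)                           # rank-one Gram update
        b += reward * x
        regret += opt - means[i]
    return regret
```

Comparing `linucb_regret(d_star)` against `linucb_regret(D)` or an undersized `linucb_regret(1)` shows why the assumed dimension matters: too small a hypothesis class risks linear regret from misspecification, while too large a class pays extra exploration cost — the tension the paper's lower bound makes precise.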
Pages: 21