No-Regret Algorithms for Heavy-Tailed Linear Bandits

Cited by: 0
|
Authors
Medina, Andres Munoz [1 ]
Yang, Scott [2 ]
Affiliations
[1] Google Res, 111 8th Av, New York, NY 10011 USA
[2] Courant Inst, 251 Mercer St, New York, NY 10012 USA
Keywords
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We analyze the problem of linear bandits under heavy-tailed noise. Most of the work on linear bandits has been based on the assumption of bounded or sub-Gaussian noise. This assumption, however, is often violated in common scenarios such as financial markets. We present two algorithms to tackle this problem: one based on dynamic truncation and one based on a median-of-means estimator. We show that, when the noise admits only a finite $(1+\epsilon)$-th moment, these algorithms still achieve regret of $\tilde{O}\big(T^{\frac{2+\epsilon}{2(1+\epsilon)}}\big)$ and $\tilde{O}\big(T^{\frac{1+2\epsilon}{1+3\epsilon}}\big)$, respectively. In particular, they guarantee sublinear regret as long as the noise has finite variance. We also present empirical results showing that our algorithms outperform the current state of the art for bounded noise when the $L_\infty$ bound on the noise is large yet the $(1+\epsilon)$-th moment of the noise is small.
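The two ingredients named in the abstract, dynamic truncation and a median-of-means estimator, are robust alternatives to the plain empirical mean when the noise only has a finite $(1+\epsilon)$-th moment. As a minimal illustrative sketch (not the paper's actual bandit algorithms), the Python snippet below shows the two underlying robust mean estimators on heavy-tailed scalar samples; the function names, the truncation threshold $(ut/\log(2/\delta))^{1/(1+\epsilon)}$, the block count $\lceil 8\log(1/\delta)\rceil$, and the moment bound `u` are assumed, textbook-style choices rather than the paper's exact parameters.

```python
import numpy as np


def truncated_mean(samples, u, eps, delta):
    """Truncation-based mean estimate for heavy-tailed samples.

    Assumes E[|X|^(1+eps)] <= u. Sample t is kept only if it lies below a
    threshold that grows with t; samples above it are zeroed out before
    averaging. The threshold is an illustrative, standard choice, not
    necessarily the paper's exact one.
    """
    x = np.asarray(samples, dtype=float)
    t = np.arange(1, len(x) + 1)
    thresholds = (u * t / np.log(2.0 / delta)) ** (1.0 / (1.0 + eps))
    return float(np.where(np.abs(x) <= thresholds, x, 0.0).mean())


def median_of_means(samples, delta):
    """Median-of-means estimate: split the samples into k blocks, average
    each block, and return the median of the block means. The block count
    below is a common textbook choice, used purely for illustration.
    """
    x = np.asarray(samples, dtype=float)
    k = max(1, min(int(np.ceil(8.0 * np.log(1.0 / delta))), len(x)))
    blocks = np.array_split(x, k)
    return float(np.median([b.mean() for b in blocks]))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Lomax (Pareto II) noise with shape 1.5: the mean exists (equal to 2),
    # but the variance is infinite, so only moments of order < 1.5 are finite.
    noise = rng.pareto(1.5, size=20_000) - 2.0  # centered, true mean 0
    print("empirical mean :", noise.mean())
    print("truncated mean :", truncated_mean(noise, u=10.0, eps=0.4, delta=0.01))
    print("median of means:", median_of_means(noise, delta=0.01))
```

On a centered heavy-tailed sample, both robust estimates typically land much closer to the true mean than the plain empirical average does, which is the concentration property that the regret bounds quoted above rely on.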
Pages: 9
Related Papers
50 records in total
  • [1] No-Regret Reinforcement Learning with Heavy-Tailed Rewards
    Zhuang, Vincent
    Sui, Yanan
    [J]. 24TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS (AISTATS), 2021, 130
  • [2] Breaking the Moments Condition Barrier: No-Regret Algorithm for Bandits with Super Heavy-Tailed Payoffs
    Zhong, Han
    Huang, Jiayi
    Yang, Lin F.
    Wang, Liwei
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [3] Nearly Optimal Regret for Stochastic Linear Bandits with Heavy-Tailed Payoffs
    Xue, Bo
    Wang, Guanghui
    Wang, Yimu
    Zhang, Lijun
    [J]. PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, : 2936 - 2942
  • [4] Efficient Algorithms for Generalized Linear Bandits with Heavy-tailed Rewards
    Xue, Bo
    Wang, Yimu
    Wan, Yuanyu
    Yi, Jinfeng
    Zhang, Lijun
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
  • [5] Almost Optimal Algorithms for Linear Stochastic Bandits with Heavy-Tailed Payoffs
    Shao, Han
    Yu, Xiaotian
    King, Irwin
    Lyu, Michael R.
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [6] Robust Heavy-Tailed Linear Bandits Algorithm
    Ma L.
    Zhao P.
    Zhou Z.
    [J]. Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2023, 60 (06): 1385 - 1395
  • [7] Optimal Algorithms for Lipschitz Bandits with Heavy-tailed Rewards
    Lu, Shiyin
    Wang, Guanghui
    Hu, Yao
    Zhang, Lijun
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
  • [8] No-Regret Linear Bandits beyond Realizability
    Liu, Chong
    Yin, Ming
    Wang, Yu-Xiang
    [J]. UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, 2023, 216 : 1294 - 1303
  • [9] Minimax Policy for Heavy-Tailed Bandits
    Wei, Lai
    Srivastava, Vaibhav
    [J]. IEEE CONTROL SYSTEMS LETTERS, 2021, 5 (04): 1423 - 1428
  • [10] Minimax Policy for Heavy-tailed Bandits
    Wei, Lai
    Srivastava, Vaibhav
    [J]. 2021 AMERICAN CONTROL CONFERENCE (ACC), 2021, : 1155 - 1160