Mean-Variance and Value at Risk in Multi-Armed Bandit Problems

Cited by: 0

Authors:

Vakili, Sattar [1]

Zhao, Qing [1]

Affiliations:

[1] Cornell Univ, Sch Elect & Comp Engn, Ithaca, NY 14850 USA

Source:

2015 53RD ANNUAL ALLERTON CONFERENCE ON COMMUNICATION, CONTROL, AND COMPUTING (ALLERTON) | 2015

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812

Abstract:

We study risk-averse multi-armed bandit problems under different risk measures. We consider three risk mitigation models. In the first model, the variations in the reward values obtained at different times are considered as risk and the objective is to minimize the mean-variance of the observed rewards. In the second and the third models, the quantity of interest is the total reward at the end of the time horizon, and the objective is to minimize the mean-variance and maximize the value at risk of the total reward, respectively. We develop risk-averse online learning policies and analyze their regret performance. We also provide tight lower bounds on regret under the model of mean-variance of observations.
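As a rough illustration of the two risk measures named in the abstract, the sketch below computes them from samples. It is not the authors' method. The mean-variance convention used here (variance minus `rho` times mean, with `rho` a risk-tolerance weight) and the quantile-style reading of value at risk are assumptions, and the function names and parameters `rho` and `alpha` are hypothetical.

```python
import math
import statistics


def empirical_mean_variance(rewards, rho=1.0):
    """Empirical mean-variance of a reward sequence.

    Assumed convention: variance - rho * mean, so lower is better.
    """
    mean = statistics.fmean(rewards)
    variance = statistics.pvariance(rewards)  # population variance
    return variance - rho * mean


def empirical_value_at_risk(total_rewards, alpha=0.05):
    """Empirical value at risk of total-reward samples.

    Assumed convention: a sample value that at least a (1 - alpha)
    fraction of the samples meet or exceed, so higher is better.
    """
    ordered = sorted(total_rewards)
    index = min(math.floor(alpha * len(ordered)), len(ordered) - 1)
    return ordered[index]


if __name__ == "__main__":
    steady = [1.0, 1.0, 1.0, 1.0]
    volatile = [0.0, 2.0, 0.0, 2.0]
    # Both sequences have the same mean, but the volatile one is penalized.
    print(empirical_mean_variance(steady), empirical_mean_variance(volatile))
    print(empirical_value_at_risk([1.0, 2.0, 3.0, 4.0], alpha=0.5))
```

Under these assumed conventions, minimizing the first quantity and maximizing the second both express risk aversion, which matches the objectives stated in the abstract.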


Pages: 1330-1335

Page count: 6

