Mean-Variance and Value at Risk in Multi-Armed Bandit Problems

Cited by: 0

Authors:

Vakili, Sattar [1]

Zhao, Qing [1]

Affiliations:

[1] Cornell Univ, Sch Elect & Comp Engn, Ithaca, NY 14850 USA

Source:

2015 53RD ANNUAL ALLERTON CONFERENCE ON COMMUNICATION, CONTROL, AND COMPUTING (ALLERTON) | 2015

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

We study risk-averse multi-armed bandit problems under different risk measures. We consider three risk mitigation models. In the first model, the variations in the reward values obtained at different times are considered as risk and the objective is to minimize the mean-variance of the observed rewards. In the second and the third models, the quantity of interest is the total reward at the end of the time horizon, and the objective is to minimize the mean-variance and maximize the value at risk of the total reward, respectively. We develop risk-averse online learning policies and analyze their regret performance. We also provide tight lower bounds on regret under the model of mean-variance of observations.
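The abstract names two risk measures without giving formulas. The sketch below illustrates them under stated assumptions: mean-variance is taken as variance minus `rho` times the mean (a convention common in the risk-averse bandit literature, assumed here and not confirmed by this record), and value at risk is taken as the empirical `alpha`-quantile of the total reward. The arm parameters, `rho`, `alpha`, and all function names are hypothetical, and no learning policy from the paper is implemented.

```python
import math
import random
import statistics


def mean_variance(rewards, rho=1.0):
    """Empirical mean-variance, ASSUMED to be variance - rho * mean (lower is better)."""
    mean = statistics.fmean(rewards)
    variance = statistics.pvariance(rewards, mu=mean)
    return variance - rho * mean


def value_at_risk(samples, alpha=0.05):
    """Empirical alpha-quantile of the samples (ASSUMED definition; higher is better)."""
    ordered = sorted(samples)
    # Smallest order statistic with at least an alpha fraction of samples at or below it.
    index = max(0, math.ceil(alpha * len(ordered)) - 1)
    return ordered[index]


def total_reward(mu, sigma, horizon, rng):
    """Total reward from playing one Gaussian arm for `horizon` rounds."""
    return sum(rng.gauss(mu, sigma) for _ in range(horizon))


rng = random.Random(0)
# Hypothetical arms given as (mean, standard deviation).
arms = {"safe": (1.0, 0.5), "risky": (1.2, 2.0)}

for name, (mu, sigma) in arms.items():
    # Models where risk is the variation of rewards observed over time.
    rewards = [rng.gauss(mu, sigma) for _ in range(1000)]
    # Models where risk concerns the total reward at the end of the horizon.
    totals = [total_reward(mu, sigma, 50, rng) for _ in range(500)]
    print(name, mean_variance(rewards), value_at_risk(totals))
```

Under these hypothetical parameters, the true mean-variance with `rho = 1` is 0.25 - 1.0 = -0.75 for the safe arm and 4.0 - 1.2 = 2.8 for the risky arm, so a mean-variance minimizer prefers the safe arm even though its mean is lower.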

Pages: 1330-1335

Number of pages: 6

50 items in total

[31] Dynamic Multi-Armed Bandit with Covariates
Pavlidis, Nicos G.
Tasoulis, Dimitris K.
Adams, Niall M.
Hand, David J.
ECAI 2008, PROCEEDINGS, 2008, 178 : 777 - +
[32] Scaling Multi-Armed Bandit Algorithms
Fouche, Edouard
Komiyama, Junpei
Boehm, Klemens
KDD'19: PROCEEDINGS OF THE 25TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2019 : 1449 - 1459
[33] The budgeted multi-armed bandit problem
Madani, O
Lizotte, DJ
Greiner, R
LEARNING THEORY, PROCEEDINGS, 2004, 3120 : 643 - 645
[34] The Multi-Armed Bandit With Stochastic Plays
Lesage-Landry, Antoine
Taylor, Joshua A.
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2018, 63 (07) : 2280 - 2286
[35] Multi-armed Bandit with Additional Observations
Yun, Donggyu
Proutiere, Alexandre
Ahn, Sumyeong
Shin, Jinwoo
Yi, Yung
PROCEEDINGS OF THE ACM ON MEASUREMENT AND ANALYSIS OF COMPUTING SYSTEMS, 2018, 2 (01)
[36] IMPROVING STRATEGIES FOR THE MULTI-ARMED BANDIT
POHLENZ, S
MARKOV PROCESS AND CONTROL THEORY, 1989, 54 : 158 - 163
[37] MULTI-ARMED BANDIT ALLOCATION INDEXES
JONES, PW
JOURNAL OF THE OPERATIONAL RESEARCH SOCIETY, 1989, 40 (12) : 1158 - 1159
[38] THE MULTI-ARMED BANDIT PROBLEM WITH COVARIATES
Perchet, Vianney
Rigollet, Philippe
ANNALS OF STATISTICS, 2013, 41 (02) : 693 - 721
[39] The Multi-fidelity Multi-armed Bandit
Kandasamy, Kirthevasan
Dasarathy, Gautam
Schneider, Jeff
Poczos, Barnabas
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 29 (NIPS 2016), 2016, 29
[40] Multi-armed Bandit with Additional Observations
Yun D.
Ahn S.
Proutiere A.
Shin J.
Yi Y.
2018, Association for Computing Machinery, 2 Penn Plaza, Suite 701, New York, NY 10121-0701, United States (46) : 53 - 55