Mean-Variance and Value at Risk in Multi-Armed Bandit Problems

Authors
Vakili, Sattar [1]
Zhao, Qing [1]
Affiliation
[1] Cornell Univ, Sch Elect & Comp Engn, Ithaca, NY 14850 USA
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We study risk-averse multi-armed bandit problems under different risk measures. We consider three risk mitigation models. In the first model, the variations in the reward values obtained at different times are considered as risk and the objective is to minimize the mean-variance of the observed rewards. In the second and the third models, the quantity of interest is the total reward at the end of the time horizon, and the objective is to minimize the mean-variance and maximize the value at risk of the total reward, respectively. We develop risk-averse online learning policies and analyze their regret performance. We also provide tight lower bounds on regret under the model of mean-variance of observations.
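As a rough illustration of the two risk measures named in the abstract, the sketch below computes an empirical mean-variance and an empirical value at risk from a list of samples. It assumes a common convention in which mean-variance is the variance minus rho times the mean, with rho a risk-tolerance parameter, and it treats value at risk as the lower alpha-quantile of the reward. Neither definition is stated in this record, and the function names are hypothetical.

```python
import math
import statistics


def empirical_mean_variance(rewards, rho):
    """Assumed definition: population variance minus rho times the mean."""
    mean = statistics.fmean(rewards)
    variance = statistics.pvariance(rewards, mu=mean)
    return variance - rho * mean


def empirical_value_at_risk(totals, alpha):
    """Assumed definition: the smallest sample v such that at least an
    alpha fraction of the samples are less than or equal to v."""
    ordered = sorted(totals)
    # Index of the lower empirical alpha-quantile, clipped to a valid range.
    k = max(0, math.ceil(alpha * len(ordered)) - 1)
    return ordered[k]


if __name__ == "__main__":
    observed = [1.0, 2.0, 3.0, 4.0]
    print(empirical_mean_variance(observed, rho=1.0))
    print(empirical_value_at_risk(observed, alpha=0.5))
```

Under these assumptions, a risk-averse policy would prefer a lower mean-variance and a higher value at risk, which matches the minimize and maximize objectives described in the abstract.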
Pages: 1330-1335
Page count: 6