Transition-based versus State-based Reward Functions for MDPs with Value-at-Risk

Cited by: 0
Authors
Ma, Shuai [1]
Yu, Jia Yuan [1]
Affiliations
[1] Concordia Univ, Fac Engn & Comp Sci, Concordia Inst Informat Syst Engn, 1515 Ste Catherine St West, Montreal, PQ, Canada
Keywords
OPTIMIZATION; VARIANCE; CRITERIA; MODELS;
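DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract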
In reinforcement learning, a reward function defined on the current state and action is widely used. When the objective concerns only the expectation of the (discounted) total reward, this works perfectly. However, if the objective involves the total reward distribution, the result will be wrong. This paper studies Value-at-Risk (VaR) problems in short- and long-horizon Markov decision processes (MDPs) with two reward functions that share the same expectations. First, we show that under a VaR objective, when the real reward function is transition-based (with respect to the action and both the current and next states), the simplified reward function (state-based, with respect to the action and the current state only) changes the VaR. Second, for long-horizon MDPs, we estimate the VaR function with the aid of spectral theory and the central limit theorem. Third, since the estimation method applies to a Markov reward process whose reward function depends on the current state only, we present a transformation algorithm for Markov reward processes whose reward function depends on the current and next states, so that the VaR function can be estimated with an intact total reward distribution.
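The first claim in the abstract can be illustrated with a small sketch. It is not the paper's method or example: the two-state Markov reward process, the names `P`, `R_TRANS`, `R_STATE`, and the helper functions are all hypothetical, and VaR is taken here as the lower alpha-quantile of the undiscounted total reward, which is only one common convention. The state-based reward is built as the expectation of the transition-based one, so both share the same mean while their total reward distributions differ.

```python
from itertools import product

# Hypothetical two-state Markov reward process (an assumption, not from the paper).
P = [[0.5, 0.5],
     [0.5, 0.5]]
# Transition-based reward r(s, s').
R_TRANS = [[0.0, 2.0],
           [0.0, 2.0]]
# State-based simplification: r(s) = sum over s' of P[s][s'] * r(s, s').
R_STATE = [sum(p * r for p, r in zip(P[s], R_TRANS[s])) for s in range(2)]


def total_reward_dist(horizon, start, reward):
    """Exact distribution of the undiscounted total reward, by path enumeration."""
    dist = {}
    for path in product(range(2), repeat=horizon):
        prob, total, s = 1.0, 0.0, start
        for nxt in path:
            prob *= P[s][nxt]
            total += reward(s, nxt)
            s = nxt
        dist[total] = dist.get(total, 0.0) + prob
    return dist


def value_at_risk(dist, alpha):
    """Lower alpha-quantile: smallest x with P(total <= x) >= alpha (assumed convention)."""
    cum = 0.0
    for x in sorted(dist):
        cum += dist[x]
        if cum >= alpha - 1e-12:
            return x
    return max(dist)


def mean(dist):
    """Expected total reward under the given distribution."""
    return sum(x * p for x, p in dist.items())


dist_trans = total_reward_dist(2, 0, lambda s, n: R_TRANS[s][n])
dist_state = total_reward_dist(2, 0, lambda s, n: R_STATE[s])

print("mean (transition-based, state-based):", mean(dist_trans), mean(dist_state))
print("VaR at alpha = 0.1 (transition-based, state-based):",
      value_at_risk(dist_trans, 0.1), value_at_risk(dist_state, 0.1))
```

In this toy case the state-based total reward is deterministic, so its quantile at every level equals the mean, whereas the transition-based total reward keeps its spread and has a lower quantile at small alpha.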
Pages: 974 - 981
Number of pages: 8
Related papers
50 in total
  • [31] Evaluation of value-at-risk in electricity markets based on multifractal theory
    Wen, F. (fushuan.wen@gmail.com), 1600, Automation of Electric Power Systems Press (37):
  • [32] Fuzzy Portfolio Selection based on Index Tracking and Value-at-Risk
    Xing, Qianli
    Wang, Bo
    Watada, Junzo
    INTELLIGENT DECISION TECHNOLOGIES, 2013, 255 : 225 - 234
  • [33] A Conditional Value-at-Risk Based Inexact Water Allocation Model
    Shao, L. G.
    Qin, X. S.
    Xu, Y.
    WATER RESOURCES MANAGEMENT, 2011, 25 (09) : 2125 - 2145
  • [34] Backtesting Value-at-Risk: A GMM Duration-Based Test
    Candelon, Bertrand
    Colletaz, Gilbert
    Hurlin, Christophe
    Tokpavi, Sessi
    JOURNAL OF FINANCIAL ECONOMETRICS, 2011, 9 (02) : 314 - 343
  • [35] A CONSISTENT ESTIMATOR TO THE ORTHANT-BASED TAIL VALUE-AT-RISK
    Beck, Nicholas
    Mailhot, Melina
    ESAIM-PROBABILITY AND STATISTICS, 2018, 22 : 163 - 177
  • [36] A News-Based Approach for Computing Historical Value-at-Risk
    Hogenboom, Frederik
    de Winter, Michael
    Frasincar, Flavius
    Hogenboom, Alexander
    MANAGEMENT INTELLIGENT SYSTEMS, 2012, 171 : 283 - 292
  • [37] A Value-at-Risk Based Approach for PMU Placement in Distribution Systems
    Liu, Min
    Energy Engineering: Journal of the Association of Energy Engineering, 2022, 119 (02) : 781 - 800
  • [38] A wind power ramp prediction method based on value-at-risk
    He, Yaoyao
    Zhu, Chuang
    Cao, Chaojin
    ENERGY CONVERSION AND MANAGEMENT, 2024, 315
  • [39] Portfolio value-at-risk forecasting with GA-based extreme value theory
    Lin, PC
    Ko, PC
    Chiang, PS
    PROCEEDINGS OF THE 8TH JOINT CONFERENCE ON INFORMATION SCIENCES, VOLS 1-3, 2005 : 1122 - 1125
  • [40] CVAR PROXIES FOR MINIMIZING SCENARIO-BASED VALUE-AT-RISK
    Mausser, Helmut
    Romanko, Oleksandr
    JOURNAL OF INDUSTRIAL AND MANAGEMENT OPTIMIZATION, 2014, 10 (04) : 1109 - 1127