A unified algorithm framework for mean-variance optimization in discounted Markov decision processes

Cited by: 1
Authors
Ma, Shuai [1]
Ma, Xiaoteng [2]
Xia, Li [1]
Affiliations
[1] Sun Yat Sen Univ, Sch Business, Guangzhou 510275, Peoples R China
[2] Tsinghua Univ, Dept Automat, Beijing 100086, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Dynamic programming; Markov decision process; Discounted mean-variance; Bilevel optimization; Bellman local-optimality equation; PORTFOLIO SELECTION; PROSPECT-THEORY; TRADEOFFS; MODEL; RISK;
DOI
10.1016/j.ejor.2023.06.022
Chinese Library Classification number
C93 [Management Science];
Discipline classification code
12 ; 1201 ; 1202 ; 120202 ;
Abstract
This paper studies the risk-averse mean-variance optimization in infinite-horizon discounted Markov decision processes (MDPs). The involved variance metric concerns reward variability during the whole process, and future deviations are discounted to their present values. This discounted mean-variance optimization yields a reward function dependent on a discounted mean, and this dependency renders traditional dynamic programming methods inapplicable since it suppresses a crucial property: time consistency. To deal with this unorthodox problem, we introduce a pseudo mean to transform the untreatable MDP to a standard one with a redefined reward function in standard form and derive a discounted mean-variance performance difference formula. With the pseudo mean, we propose a unified algorithm framework with a bilevel optimization structure for the discounted mean-variance optimization. The framework unifies a variety of algorithms for several variance-related problems, including, but not limited to, risk-averse variance and mean-variance optimizations in discounted and average MDPs. Furthermore, the convergence analyses missing from the literature can be complemented with the proposed framework as well. Taking the value iteration as an example, we develop a discounted mean-variance value iteration algorithm and prove its convergence to a local optimum with the aid of a Bellman local-optimality equation. Finally, we conduct a numerical experiment on portfolio management to validate the proposed algorithm. © 2023 Elsevier B.V. All rights reserved.
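The bilevel idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm as published; it rests on assumptions the abstract does not spell out: the objective is taken to be the normalized discounted mean minus `beta` times the normalized discounted variance, the pseudo mean `y` replaces the true mean in the reward as `r - beta * (r - y)**2`, and the outer level simply resets `y` to the discounted mean of the policy found by the inner level. All function names, the array layout `P[s, a, s']`, the initial distribution `alpha`, and the weight `beta` are hypothetical.

```python
import numpy as np

def solve_standard_mdp(P, f, gamma, tol=1e-10, max_iter=10_000):
    """Inner level: plain value iteration for reward f[s, a]; returns a greedy policy."""
    v = np.zeros(f.shape[0])
    for _ in range(max_iter):
        v_new = (f + gamma * (P @ v)).max(axis=1)  # P @ v has shape (S, A)
        done = np.max(np.abs(v_new - v)) < tol
        v = v_new
        if done:
            break
    return (f + gamma * (P @ v)).argmax(axis=1)

def discounted_average(P, g, policy, gamma, alpha):
    """Normalized discounted expectation (1 - gamma) * E[sum_t gamma^t g_t] under a policy."""
    idx = np.arange(len(policy))
    P_pi = P[idx, policy]            # (S, S) transition matrix of the policy
    g_pi = g[idx, policy]            # (S,) per-state reward of the policy
    v = np.linalg.solve(np.eye(len(policy)) - gamma * P_pi, g_pi)
    return (1.0 - gamma) * (alpha @ v)

def mean_variance_objective(P, r, policy, gamma, alpha, beta):
    """Assumed objective: discounted mean minus beta times discounted variance."""
    mu = discounted_average(P, r, policy, gamma, alpha)
    var = discounted_average(P, (r - mu) ** 2, policy, gamma, alpha)
    return mu - beta * var, mu

def mean_variance_iteration(P, r, gamma, alpha, beta, y0=0.0, max_outer=100):
    """Outer level: alternate between solving the pseudo-mean MDP and updating y."""
    y, history = y0, []
    for _ in range(max_outer):
        f = r - beta * (r - y) ** 2  # redefined reward with pseudo mean y
        policy = solve_standard_mdp(P, f, gamma)
        objective, mu = mean_variance_objective(P, r, policy, gamma, alpha, beta)
        history.append(objective)
        if abs(mu - y) < 1e-12:      # y already equals the policy's discounted mean
            break
        y = mu
    return policy, y, history
```

Under these assumptions, the pseudo-mean objective never exceeds the true objective and matches it when `y` equals the policy's discounted mean, so each outer step cannot decrease the recorded objective. The fixed point this reaches is only locally optimal, in line with the local-optimum convergence the abstract claims.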
Pages: 1057-1067
Number of pages: 11