Reinforcement learning for continuous-time mean-variance portfolio selection in a regime-switching market

Cited by: 4
Authors
Wu, Bo [1]
Li, Lingfei [1]
Affiliation
[1] Chinese Univ Hong Kong, Dept Syst Engn & Engn Management, Hong Kong, Peoples R China
Source
JOURNAL OF ECONOMIC DYNAMICS & CONTROL | 2024 / Vol. 158
Keywords
Reinforcement learning; Actor-critic; Mean-variance; Portfolio selection; Partial information; Regime-switching; Wonham's filter; Asset allocation; Optimization
DOI
10.1016/j.jedc.2023.104787
Chinese Library Classification
F [Economics];
Discipline classification code
02;
Abstract
We propose a reinforcement learning (RL) approach to solve the continuous-time mean-variance portfolio selection problem in a regime-switching market, where the market regime is unobservable. To encourage exploration for learning, we formulate an exploratory stochastic control problem with an entropy-regularized mean-variance objective. We obtain semi-analytical representations of the optimal value function and optimal policy, which involve unknown solutions to two linear parabolic partial differential equations (PDEs). We utilize these representations to parametrize the value function and policy for learning with the unknown solutions to the PDEs approximated based on polynomials. We develop an actor-critic RL algorithm to learn the optimal policy through interactions with the market environment. The algorithm carries out filtering to obtain the belief probability of the market regime and performs policy evaluation and policy gradient updates alternately. Empirical results demonstrate the advantages of our RL algorithm in relatively long-term investment problems over the classical control approach and an RL algorithm developed for the continuous-time mean-variance problem without considering regime switches.
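The filtering step named in the abstract can be illustrated with a minimal sketch. It assumes a two-regime market with a constant volatility, a regime-dependent drift, and known switching intensities, and it applies a plain Euler discretization of Wonham's filter with clipping. The function names (`wonham_step`, `simulate_and_filter`) and all parameter values are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
import math
import random


def wonham_step(p, ret, dt, mu1, mu2, sigma, q12, q21):
    """One Euler step of a two-regime Wonham filter (illustrative sketch).

    p    : current belief that the market is in regime 1
    ret  : observed simple return dS/S over the step of length dt
    q12  : switching intensity from regime 1 to regime 2 (q21: reverse)
    """
    mu_bar = p * mu1 + (1.0 - p) * mu2            # filtered drift estimate
    innovation = (ret - mu_bar * dt) / sigma      # innovation increment
    drift = (q21 * (1.0 - p) - q12 * p) * dt      # regime-switching term
    gain = p * (1.0 - p) * (mu1 - mu2) / sigma    # filter gain
    p_new = p + drift + gain * innovation
    # The Euler step can leave (0, 1), so clip to keep a valid probability.
    return min(max(p_new, 1e-6), 1.0 - 1e-6)


def simulate_and_filter(n_steps=2520, dt=1.0 / 252, seed=0,
                        mu1=0.25, mu2=-0.10, sigma=0.20, q12=1.0, q21=2.0):
    """Simulate a hidden two-regime market and filter the regime belief."""
    rng = random.Random(seed)
    regime = 1
    p = q21 / (q12 + q21)  # start from the stationary probability of regime 1
    beliefs = []
    for _ in range(n_steps):
        # Hidden regime switches with probability (intensity * dt) per step.
        rate = q12 if regime == 1 else q21
        if rng.random() < rate * dt:
            regime = 2 if regime == 1 else 1
        mu = mu1 if regime == 1 else mu2
        # Only the return is observed; the regime itself is not.
        ret = mu * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        p = wonham_step(p, ret, dt, mu1, mu2, sigma, q12, q21)
        beliefs.append(p)
    return beliefs


if __name__ == "__main__":
    path = simulate_and_filter()
    print(f"final belief in regime 1: {path[-1]:.3f}")
```

In an actor-critic setup of the kind the abstract describes, a belief computed this way would plausibly serve as a state input alongside time and wealth; how the authors actually wire it in is not stated in this record.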
Pages: 28
Related papers
50 in total
  • [21] Mean-variance portfolio selection under a non-Markovian regime-switching model
    Wang, Tianxiao
    Wei, Jiaqin
    JOURNAL OF COMPUTATIONAL AND APPLIED MATHEMATICS, 2019, 350 : 442 - 455
  • [22] Markowitz's mean-variance portfolio selection with regime switching: From discrete-time models to their continuous-time limits
    Yin, G
    Zhou, XY
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2004, 49 (03) : 349 - 360
  • [23] Achieving Mean-Variance Efficiency by Continuous-Time Reinforcement Learning
    Huang, Yilie
    Jia, Yanwei
    Zhou, Xunyu
    Proceedings of the 3rd ACM International Conference on AI in Finance, ICAIF 2022, 2022, : 377 - 385
  • [24] Continuous-time mean-variance portfolio selection with only risky assets
    Yao, Haixiang
    Li, Zhongfei
    Chen, Shumin
    ECONOMIC MODELLING, 2014, 36 : 244 - 251
  • [25] Continuous-Time Mean-Variance Portfolio Selection under the CEV Process
    Ma, Hui-qiang
    ABSTRACT AND APPLIED ANALYSIS, 2014,
  • [26] Achieving Mean-Variance Efficiency by Continuous-Time Reinforcement Learning
    Huang, Yilie
    Jia, Yanwei
    Zhou, Xun Yu
    3RD ACM INTERNATIONAL CONFERENCE ON AI IN FINANCE, ICAIF 2022, 2022, : 377 - 385
  • [27] Continuous-Time Mean-Variance Portfolio Selection: A Stochastic LQ Framework
    X. Y. Zhou
    D. Li
    Applied Mathematics & Optimization, 2000, 42 : 19 - 33
  • [28] Continuous-time mean-variance portfolio selection: A stochastic LQ framework
    Zhou, XY
    Li, D
    APPLIED MATHEMATICS AND OPTIMIZATION, 2000, 42 (01): : 19 - 33
  • [29] Log Mean-Variance Portfolio Selection Under Regime Switching
    Ishijima H.
    Uchida M.
    Asia-Pacific Financial Markets, 2011, 18 (2) : 213 - 229
  • [30] Continuous-time mean-variance portfolio optimization in a jump-diffusion market
    Alp O.S.
    Korn R.
    Decisions in Economics and Finance, 2011, 34 (1) : 21 - 40