Multi-step reward ensemble methods for adaptive stock trading

Cited by: 0
Authors
Zeng, Zhiyi [1]
Ma, Cong [2]
Chang, Xiangyu [3]
Affiliations
[1] Hubei Normal Univ, Sch Math & Stat, Huangshi, Peoples R China
[2] Northwest Univ, Sch Econ & Management, Xian, Peoples R China
[3] Xi An Jiao Tong Univ, Sch Management, Ctr Intelligent Decis Making & Machine Learning, Xian, Peoples R China
Keywords
Multi-step reward; Reward ensemble; Adaptive trading; Thompson sampling; VOLATILITY; RETURNS; RULES;
DOI
10.1016/j.eswa.2023.120547
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Stock trading can be modeled as a Markov decision process, which makes it a natural application of reinforcement learning (RL). Numerous studies have combined stock trading with RL, but they fit the market with a single reward function. Real markets, however, show distinct patterns in different periods, such as bullish and bearish phases, and a reward function that works well in a bullish period may perform poorly in a bearish one. In this work, we construct several kinds of multi-step reward functions based on future prices (a profit-based reward and a regularized-based reward) to account for the constantly changing market. We further propose two ensemble rewards, one based on the greedy method (MSR-GME, Multi-Step Rewards Greedy Method Ensemble) and one based on Thompson sampling (MSR-TSE, Multi-Step Rewards Thompson Sampling Ensemble), to help agents make adaptive trading decisions under distinct market patterns. Extensive experiments verify the mechanisms and the superiority of the constructed reward functions from multiple aspects. The two single-reward functions consistently outperform both the buy-and-hold strategy (B&H) and historical-price-based rewards by a large margin (for example, the profit-based reward achieves up to 7.3 times the Sortino ratio of B&H and a maximum drawdown up to 78.6% lower). Moreover, the ensemble rewards substantially improve strategy performance, yielding higher profits and lower risks (for example, MSR-TSE achieves up to 49.7 times the profit and 8.85 times the Sortino ratio of B&H). We also find that MSR-TSE is risk-averse whereas MSR-GME is risk-aggressive, indicating that Thompson sampling is a highly competitive ensemble method, especially in bearish markets.
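The abstract names two ingredients, a multi-step reward computed from future prices and a Thompson-sampling ensemble over rewards, without giving their exact form. The sketch below is only an illustration under stated assumptions, not the authors' method: `multi_step_profit_reward`, `ThompsonRewardSelector`, the position encoding (+1 long, 0 flat, -1 short), and the Beta-Bernoulli success/failure feedback are all hypothetical choices made for this example.

```python
import random


def multi_step_profit_reward(prices, t, action, horizon):
    """Hypothetical profit-based multi-step reward (an assumption, not the paper's formula).

    Returns the position (+1 long, 0 flat, -1 short) times the relative price
    change from step t to step t + horizon, clipped to the end of the series.
    """
    end = min(t + horizon, len(prices) - 1)
    return action * (prices[end] - prices[t]) / prices[t]


class ThompsonRewardSelector:
    """Beta-Bernoulli Thompson sampling over candidate reward functions.

    Each arm stands for one candidate reward function. The binary
    success signal used in `update` is an assumption of this sketch.
    """

    def __init__(self, n_arms, seed=0):
        self.rng = random.Random(seed)
        self.alpha = [1.0] * n_arms  # prior pseudo-count of successes
        self.beta = [1.0] * n_arms   # prior pseudo-count of failures

    def select(self):
        # Draw one sample per arm from its Beta posterior and pick the largest.
        samples = [self.rng.betavariate(a, b) for a, b in zip(self.alpha, self.beta)]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, arm, success):
        # Bayesian update of the chosen arm's Beta posterior.
        if success:
            self.alpha[arm] += 1.0
        else:
            self.beta[arm] += 1.0


if __name__ == "__main__":
    prices = [100.0, 110.0, 121.0]
    # Long position held over a 2-step horizon.
    print(multi_step_profit_reward(prices, t=0, action=1, horizon=2))

    selector = ThompsonRewardSelector(n_arms=2)
    arm = selector.select()
    selector.update(arm, success=True)
```

In a setup like this, the agent would call `select()` at the start of each training period, train with the chosen reward function, and then call `update()` with whatever success criterion is adopted (for instance, whether the period's return was positive). Because each arm is chosen by sampling from its posterior rather than by taking the current best estimate, weaker-looking rewards still get occasional trials, which is the usual contrast between Thompson sampling and a greedy selector.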
Pages: 20