Supervised actor-critic reinforcement learning with action feedback for algorithmic trading

被引：5

作者：

Sun, Qizhou ^{[1
]}

Si, Yain-Whar ^{[1
]}

机构：

[1] Univ Macau, Dept Comp & Informat Sci, Ave da Univ, Taipa, Macau, Peoples R China

来源：

APPLIED INTELLIGENCE | 2023年 / 53卷 / 13期

关键词：

Finance; Reinforcement learning; Supervised learning; Algorithmic trading; ENERGY;

D O I：

10.1007/s10489-022-04322-5

中图分类号：

TP18 [人工智能理论];

学科分类号：

081104 ; 0812 ; 0835 ; 1405 ;

摘要：

Reinforcement learning is one of the promising approaches for algorithmic trading in financial markets. However, in certain situations, buy or sell orders issued by an algorithmic trading program may not be fulfilled entirely. By considering the actual scenarios from the financial markets, in this paper, we propose a novel framework named Supervised Actor-Critic Reinforcement Learning with Action Feedback (SACRL-AF) for solving this problem. The action feedback mechanism of SACRL-AF notifies the actor about the dealt positions and corrects the transitions of the replay buffer. Meanwhile, the dealt positions are used as the labels for the supervised learning. Recent studies have shown that Deep Deterministic Policy Gradient (DDPG) and Twin Delayed Deep Deterministic Policy Gradient (TD3) are more stable and superior to other actor-critic algorithms. Against this background, based on the proposed SACRL-AF framework, two reinforcement learning algorithms henceforth referred to as Supervised Deep Deterministic Policy Gradient with Action Feedback (SDDPG-AF) and Supervised Twin Delayed Deep Deterministic Policy Gradient with Action Feedback (STD3-AF) are proposed in this paper. Experimental results show that SDDPG-AF and STD3-AF achieve the state-of-art performance in profitability.

引用

页码：16875 / 16892

页数：18

共 50 条

[1] Supervised actor-critic reinforcement learning with action feedback for algorithmic trading
Qizhou Sun
Yain-Whar Si
Applied Intelligence, 2023, 53 : 16875 - 16892
[2] Actor-critic reinforcement learning for the feedback control of a swinging chain
Dengler, C.
Lohmann, B.
IFAC PAPERSONLINE, 2018, 51 (13): : 378 - 383
[3] Hybrid Actor-Critic Reinforcement Learning in Parameterized Action Space
Fan, Zhou
Su, Rui
Zhang, Weinan
Yu, Yong
PROCEEDINGS OF THE TWENTY-EIGHTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2019, : 2279 - 2285
[4] A World Model for Actor-Critic in Reinforcement Learning
Panov, A. I.
Ugadiarov, L. A.
PATTERN RECOGNITION AND IMAGE ANALYSIS, 2023, 33 (03) : 467 - 477
[5] Actor-Critic based Improper Reinforcement Learning
Zaki, Mohammadi
Mohan, Avinash
Gopalan, Aditya
Mannor, Shie
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022,
[6] Curious Hierarchical Actor-Critic Reinforcement Learning
Roeder, Frank
Eppe, Manfred
Nguyen, Phuong D. H.
Wermter, Stefan
ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2020, PT II, 2020, 12397 : 408 - 419
[7] Integrated Actor-Critic for Deep Reinforcement Learning
Zheng, Jiaohao
Kurt, Mehmet Necip
Wang, Xiaodong
ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2021, PT IV, 2021, 12894 : 505 - 518
[8] A fuzzy Actor-Critic reinforcement learning network
Wang, Xue-Song
Cheng, Yu-Hu
Yi, Jian-Qiang
INFORMATION SCIENCES, 2007, 177 (18) : 3764 - 3781
[9] A modified actor-critic reinforcement learning algorithm
Mustapha, SM
Lachiver, G
2000 CANADIAN CONFERENCE ON ELECTRICAL AND COMPUTER ENGINEERING, CONFERENCE PROCEEDINGS, VOLS 1 AND 2: NAVIGATING TO A NEW ERA, 2000, : 605 - 609
[10] Research on actor-critic reinforcement learning in RoboCup
Guo, He
Liu, Tianying
Wang, Yuxin
Chen, Feng
Fan, Jianming
WCICA 2006: SIXTH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION, VOLS 1-12, CONFERENCE PROCEEDINGS, 2006, : 205 - 205

← 1 2 3 4 5 →