SOFT ACTOR-CRITIC ALGORITHM WITH ADAPTIVE NORMALIZATION

Cited: 0
Authors
Gao, Xiaonan [1 ]
Wu, Ziyi [1 ]
Zhu, Xianchao [1 ]
Cai, Lei [2 ]
Affiliations
[1] Henan Univ Technol, Sch Artificial Intelligence & Big Data, Zhengzhou 450001, Peoples R China
[2] Henan Inst Sci & Technol, Sch Artificial Intelligence, Xinxiang 453003, Peoples R China
Source
Funding
National Natural Science Foundation of China;
Keywords
Adaptive normalization; Deep reinforcement learning; Reward mechanism; Soft actor-critic algorithm; GAME; GO;
DOI
10.23952/jnfa.2025.6
CLC Number
O29 [Applied Mathematics];
Discipline Code
070104;
Abstract
In recent years, deep reinforcement learning has achieved notable breakthroughs, but its real-world applications are seriously hampered by algorithmic instability and the difficulty of guaranteeing convergence. The soft actor-critic (SAC) algorithm, a representative reinforcement learning method, improves robustness and the agent's exploration ability by introducing the maximum-entropy principle, yet it still suffers from instability during training. To address this problem, this paper proposes an Adaptive Normalization-based SAC (AN-SAC) algorithm. By introducing an adaptive reward-normalization mechanism into SAC, the method dynamically adjusts the normalization parameters of the reward during training so that reward values have zero mean and unit variance. The algorithm thereby adapts better to the reward distribution, improving its performance and stability. Experimental results show that AN-SAC achieves significantly better performance and stability than SAC.
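The abstract describes the mechanism only at a high level: normalization parameters are adjusted online during training so that rewards have zero mean and unit variance. The minimal Python sketch below illustrates one plausible reading; the class name, the Welford-style running-statistics update, and the epsilon constant are illustrative assumptions, not the paper's exact formulation.

    # Sketch of adaptive reward normalization as described in the abstract:
    # running statistics are updated with each observed reward so that
    # normalized rewards have (approximately) zero mean and unit variance.
    import math

    class AdaptiveRewardNormalizer:
        def __init__(self, eps: float = 1e-8):
            self.count = 0    # number of rewards seen so far
            self.mean = 0.0   # running mean of rewards
            self.m2 = 0.0     # running sum of squared deviations
            self.eps = eps    # numerical floor for the standard deviation

        def update(self, reward: float) -> None:
            """Fold one observed reward into the running statistics
            (Welford's online algorithm)."""
            self.count += 1
            delta = reward - self.mean
            self.mean += delta / self.count
            self.m2 += delta * (reward - self.mean)

        def normalize(self, reward: float) -> float:
            """Shift and scale the reward by the current statistics."""
            var = self.m2 / self.count if self.count > 1 else 1.0
            return (reward - self.mean) / (math.sqrt(var) + self.eps)

    # Usage inside a (hypothetical) SAC training loop: update the statistics
    # with each raw reward, then store the normalized reward in the replay
    # buffer in place of the raw one.
    normalizer = AdaptiveRewardNormalizer()
    for raw_reward in [1.0, -0.5, 2.3, 0.7]:  # stand-in for environment rewards
        normalizer.update(raw_reward)
        r = normalizer.normalize(raw_reward)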
Pages: 10
Related Papers
50 records in total
  • [31] Regularized Soft Actor-Critic for Behavior Transfer Learning
    Tan, Mingxi
    Tian, Andong
    Denoyer, Ludovic
    2022 IEEE CONFERENCE ON GAMES, COG, 2022: 516-519
  • [32] PAC-Bayesian Soft Actor-Critic Learning
    Tasdighi, Bahareh
    Akgul, Abdullah
    Haussmann, Manuel
    Brink, Kenny Kazimirzak
    Kandemir, Melih
    SYMPOSIUM ON ADVANCES IN APPROXIMATE BAYESIAN INFERENCE, 2024, 253: 127-145
  • [33] Averaged Soft Actor-Critic for Deep Reinforcement Learning
    Ding, Feng
    Ma, Guanfeng
    Chen, Zhikui
    Gao, Jing
    Li, Peng
    COMPLEXITY, 2021, 2021
  • [34] Actor-Critic Learning Based on Adaptive Importance Sampling
    Cheng, Yuhu
    Feng, Huanting
    Wang, Xuesong
    CHINESE JOURNAL OF ELECTRONICS, 2010, 19 (04): 583-588
  • [35] Actor-critic algorithms
    Konda, VR
    Tsitsiklis, JN
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 12, 2000, 12: 1008-1014
  • [36] On actor-critic algorithms
    Konda, VR
    Tsitsiklis, JN
    SIAM JOURNAL ON CONTROL AND OPTIMIZATION, 2003, 42 (04): 1143-1166
  • [37] Natural Actor-Critic
    Peters, Jan
    Schaal, Stefan
    NEUROCOMPUTING, 2008, 71 (7-9): 1180-1190
  • [38] A supervised Actor-Critic approach for adaptive cruise control
    Zhao, Dongbin
    Wang, Bin
    Liu, Derong
    SOFT COMPUTING, 2013, 17 (11): 2089-2099
  • [39] Autonomous Decision-Making Generation of UAV based on Soft Actor-Critic Algorithm
    Cheng, Yan
    Song, Yong
    PROCEEDINGS OF THE 39TH CHINESE CONTROL CONFERENCE, 2020: 7350-7355
  • [40] Nash Soft Actor-Critic LEO Satellite Handover Management Algorithm for Flying Vehicles
    Chen, Jinxuan
    Ozger, Mustafa
    Cavdar, Cicek
    2024 IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING FOR COMMUNICATION AND NETWORKING, ICMLCN 2024, 2024: 380-385