SOFT ACTOR-CRITIC ALGORITHM WITH ADAPTIVE NORMALIZATION

Cited: 0
Authors
Gao, Xiaonan [1 ]
Wu, Ziyi [1 ]
Zhu, Xianchao [1 ]
Cai, Lei [2 ]
Affiliations
[1] Henan Univ Technol, Sch Artificial Intelligence & Big Data, Zhengzhou 450001, Peoples R China
[2] Henan Inst Sci & Technol, Sch Artificial Intelligence, Xinxiang 453003, Peoples R China
Source
Funding
National Natural Science Foundation of China;
Keywords
Adaptive normalization; Deep reinforcement learning; Reward mechanism; Soft actor-critic algorithm; GAME; GO;
DOI
10.23952/jnfa.2025.6
CLC Number
O29 [Applied Mathematics];
Discipline Code
070104;
Abstract
In recent years, deep reinforcement learning has achieved notable breakthroughs, but its real-world applications are seriously hampered by algorithmic instability and the difficulty of guaranteeing convergence. The soft actor-critic (SAC) algorithm, a representative reinforcement learning method, improves robustness and the agent's exploration ability by introducing the maximum-entropy principle, yet it still suffers from instability during training. To address this problem, this paper proposes an Adaptive Normalization-based SAC (AN-SAC) algorithm. By introducing an adaptive reward-normalization mechanism into SAC, the method dynamically adjusts the normalization parameters of the reward during training so that reward values have zero mean and unit variance. The algorithm thereby adapts better to the reward distribution, improving its performance and stability. Experimental results show that AN-SAC achieves significantly better performance and stability than SAC.
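The abstract describes the mechanism only at a high level: normalization parameters are adjusted online during training so that rewards have zero mean and unit variance. The minimal Python sketch below illustrates one plausible reading; the class name, the Welford-style running-statistics update, and the epsilon constant are illustrative assumptions, not the paper's exact formulation.

    # Sketch of adaptive reward normalization as described in the abstract:
    # running statistics are updated with each observed reward so that
    # normalized rewards have (approximately) zero mean and unit variance.
    import math

    class AdaptiveRewardNormalizer:
        def __init__(self, eps: float = 1e-8):
            self.count = 0    # number of rewards seen so far
            self.mean = 0.0   # running mean of rewards
            self.m2 = 0.0     # running sum of squared deviations
            self.eps = eps    # numerical floor for the standard deviation

        def update(self, reward: float) -> None:
            """Fold one observed reward into the running statistics
            (Welford's online algorithm)."""
            self.count += 1
            delta = reward - self.mean
            self.mean += delta / self.count
            self.m2 += delta * (reward - self.mean)

        def normalize(self, reward: float) -> float:
            """Shift and scale the reward by the current statistics."""
            var = self.m2 / self.count if self.count > 1 else 1.0
            return (reward - self.mean) / (math.sqrt(var) + self.eps)

    # Usage inside a (hypothetical) SAC training loop: update the statistics
    # with each raw reward, then store the normalized reward in the replay
    # buffer in place of the raw one.
    normalizer = AdaptiveRewardNormalizer()
    for raw_reward in [1.0, -0.5, 2.3, 0.7]:  # stand-in for environment rewards
        normalizer.update(raw_reward)
        r = normalizer.normalize(raw_reward)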
Pages: 10
Related Papers
50 records in total
  • [31] Regularized Soft Actor-Critic for Behavior Transfer Learning
    Tan, Mingxi
    Tian, Andong
    Denoyer, Ludovic
    2022 IEEE CONFERENCE ON GAMES, COG, 2022: 516-519
  • [32] PAC-Bayesian Soft Actor-Critic Learning
    Tasdighi, Bahareh
    Akgul, Abdullah
    Haussmann, Manuel
    Brink, Kenny Kazimirzak
    Kandemir, Melih
    SYMPOSIUM ON ADVANCES IN APPROXIMATE BAYESIAN INFERENCE, 2024, 253: 127-145
  • [33] Averaged Soft Actor-Critic for Deep Reinforcement Learning
    Ding, Feng
    Ma, Guanfeng
    Chen, Zhikui
    Gao, Jing
    Li, Peng
    COMPLEXITY, 2021, 2021
  • [34] Actor-Critic Learning Based on Adaptive Importance Sampling
    Cheng, Yuhu
    Feng, Huanting
    Wang, Xuesong
    CHINESE JOURNAL OF ELECTRONICS, 2010, 19 (04): 583-588
  • [35] Actor-critic algorithms
    Konda, VR
    Tsitsiklis, JN
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 12, 2000, 12: 1008-1014
  • [36] On actor-critic algorithms
    Konda, VR
    Tsitsiklis, JN
    SIAM JOURNAL ON CONTROL AND OPTIMIZATION, 2003, 42 (04): 1143-1166
  • [37] Natural Actor-Critic
    Peters, Jan
    Schaal, Stefan
    NEUROCOMPUTING, 2008, 71 (7-9): 1180-1190
  • [38] A supervised Actor-Critic approach for adaptive cruise control
    Zhao, Dongbin
    Wang, Bin
    Liu, Derong
    SOFT COMPUTING, 2013, 17 (11): 2089-2099
  • [39] Autonomous Decision-Making Generation of UAV based on Soft Actor-Critic Algorithm
    Cheng, Yan
    Song, Yong
    PROCEEDINGS OF THE 39TH CHINESE CONTROL CONFERENCE, 2020: 7350-7355
  • [40] Nash Soft Actor-Critic LEO Satellite Handover Management Algorithm for Flying Vehicles
    Chen, Jinxuan
    Ozger, Mustafa
    Cavdar, Cicek
    2024 IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING FOR COMMUNICATION AND NETWORKING, ICMLCN 2024, 2024: 380-385