Improved SARSA and DQN algorithms for reinforcement learning

Cited by: 0

Authors:

Yao, Guangyu [1,2]

Zhang, Nan [1,2]

Duan, Zhenhua [1,2]

Tian, Cong [1,2]

Affiliations:

[1] Xidian Univ, Inst Comp Theory & Technol, Xian 710071, Peoples R China

[2] Xidian Univ, ISN Lab, Xian 710071, Peoples R China

Source:

THEORETICAL COMPUTER SCIENCE | 2025 / Vol. 1027

Keywords:

Machine learning; Reinforcement learning; Deep Q-network; epsilon-greedy policy; Value iteration

DOI:

10.1016/j.tcs.2024.115025

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods]

Subject classification code:

081202

Abstract:

Reinforcement learning is a branch of machine learning in which an agent interacts with an environment to learn optimal actions that maximize cumulative rewards. This paper aims to enhance the SARSA and DQN algorithms in four key aspects: the epsilon-greedy policy, reward function, value iteration approach, and sampling probability. The experiments are conducted in three scenarios: path planning, CartPole, and MountainCar. The results show that, in these environments, the improved algorithms exhibit better convergence, higher rewards, and more stable training processes.
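
For orientation, the textbook tabular SARSA baseline with an epsilon-greedy policy, which is the kind of algorithm the paper sets out to improve, can be sketched as below. This is not the authors' improved method. The one-dimensional corridor environment, the `step` helper, the reward values, and all hyperparameters are hypothetical and chosen only for illustration.

```python
import random


def epsilon_greedy(q_row, epsilon, rng):
    """With probability epsilon pick a random action, otherwise a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    best = max(q_row)
    # Break ties at random so all-zero rows do not always favour action 0.
    return rng.choice([a for a, v in enumerate(q_row) if v == best])


def step(state, action, n_states):
    """Hypothetical corridor: action 0 moves left, action 1 moves right."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    done = nxt == n_states - 1          # the rightmost cell is the goal
    reward = 1.0 if done else -0.01     # assumed reward shaping
    return nxt, reward, done


def sarsa(n_states=6, episodes=300, alpha=0.5, gamma=0.95, epsilon=0.1, seed=0):
    """Standard on-policy SARSA; returns the learned Q-table."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        a = epsilon_greedy(q[s], epsilon, rng)
        done = False
        while not done:
            s2, r, done = step(s, a, n_states)
            a2 = epsilon_greedy(q[s2], epsilon, rng)
            # SARSA bootstraps from the action actually chosen in s2.
            target = r if done else r + gamma * q[s2][a2]
            q[s][a] += alpha * (target - q[s][a])
            s, a = s2, a2
    return q


if __name__ == "__main__":
    for state, row in enumerate(sarsa()):
        print(state, [round(v, 3) for v in row])
```

The update uses the next action sampled from the same epsilon-greedy policy, which is what makes SARSA on-policy; DQN instead bootstraps from the maximum estimated value of the next state and replaces the table with a neural network.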


Pages: 15
