A Parallel Approach to Advantage Actor Critic in Deep Reinforcement Learning

Cited by: 0
Authors
Zhu, Xing [1]
Du, Yunfei [1]
Affiliations
[1] Sun Yat-sen University, School of Data and Computer Science, Guangzhou, People's Republic of China
Keywords
Deep reinforcement learning; Advantage actor critic; Parallelization; MPI; Scalable
DOI
10.1007/978-3-030-38961-1_28
CLC Number
TP3 [Computing technology, computer technology]
Discipline Code
0812
Abstract
Deep reinforcement learning (DRL) algorithms still take a long time to train models in many applications. Parallelization has the potential to improve the efficiency of DRL algorithms. In this paper, we propose a parallel approach (ParaA2C) for the popular actor-critic (AC) algorithms in DRL, to accelerate the training process. Our work considers the parallelization of the basic advantage actor-critic algorithm (Serial-A2C). Specifically, we use multiple actor-learners to mitigate the strong correlation of training data and the instability of updates, thereby reducing the training time. Each actor-learner MPI process is pinned to its own CPU core to prevent resource contention between MPI processes and to keep ParaA2C scalable. We demonstrate the effectiveness of ParaA2C through experiments on the Arcade Learning Environment (ALE) platform. Notably, ParaA2C takes less than 10 min to train on some commonly used Atari games when using 512 CPU cores.
Pages: 320 - 327
Number of pages: 8
Related papers
50 records in total
  • [41] DAG-based workflows scheduling using Actor–Critic Deep Reinforcement Learning
    Koslovski, Guilherme Piêgas
    Pereira, Kleiton
    Albuquerque, Paulo Roberto
    [J]. Future Generation Computer Systems, 2024, 150 : 354 - 363
  • [42] Dynamic spectrum access and sharing through actor-critic deep reinforcement learning
    Dong, Liang
    Qian, Yuchen
    Xing, Yuan
    [J]. EURASIP JOURNAL ON WIRELESS COMMUNICATIONS AND NETWORKING, 2022, 2022 (01)
  • [43] A World Model for Actor-Critic in Reinforcement Learning
    Panov, A. I.
    Ugadiarov, L. A.
    [J]. PATTERN RECOGNITION AND IMAGE ANALYSIS, 2023, 33 (03) : 467 - 477
  • [44] Actor-Critic based Improper Reinforcement Learning
    Zaki, Mohammadi
    Mohan, Avinash
    Gopalan, Aditya
    Mannor, Shie
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022
  • [45] Curious Hierarchical Actor-Critic Reinforcement Learning
    Roeder, Frank
    Eppe, Manfred
    Nguyen, Phuong D. H.
    Wermter, Stefan
    [J]. ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2020, PT II, 2020, 12397 : 408 - 419
  • [46] The Actor-Dueling-Critic Method for Reinforcement Learning
    Wu, Menghao
    Gao, Yanbin
    Jung, Alexander
    Zhang, Qiang
    Du, Shitong
    [J]. SENSORS, 2019, 19 (07)
  • [47] Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
    Haarnoja, Tuomas
    Zhou, Aurick
    Abbeel, Pieter
    Levine, Sergey
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 80, 2018
  • [48] A fuzzy Actor-Critic reinforcement learning network
    Wang, Xue-Song
    Cheng, Yu-Hu
    Yi, Jian-Qiang
    [J]. INFORMATION SCIENCES, 2007, 177 (18) : 3764 - 3781
  • [49] A modified actor-critic reinforcement learning algorithm
    Mustapha, SM
    Lachiver, G
    [J]. 2000 CANADIAN CONFERENCE ON ELECTRICAL AND COMPUTER ENGINEERING, CONFERENCE PROCEEDINGS, VOLS 1 AND 2: NAVIGATING TO A NEW ERA, 2000, : 605 - 609
  • [50] Adversarially Trained Actor Critic for Offline Reinforcement Learning
    Cheng, Ching-An
    Xie, Tengyang
    Jiang, Nan
    Agarwal, Alekh
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022