A Parallel Approach to Advantage Actor Critic in Deep Reinforcement Learning

Cited by: 0
Authors
Zhu, Xing [1]
Du, Yunfei [1]
Affiliations
[1] Sun Yat-sen University, School of Data and Computer Science, Guangzhou, People's Republic of China
Keywords
Deep reinforcement learning; Advantage actor critic; Parallelization; MPI; Scalable
DOI
10.1007/978-3-030-38961-1_28
CLC Number
TP3 [Computing technology, computer technology]
Discipline Code
0812
Abstract
Deep reinforcement learning (DRL) algorithms still take a long time to train models in many applications. Parallelization has the potential to improve the efficiency of DRL algorithms. In this paper, we propose a parallel approach (ParaA2C) for the popular actor-critic (AC) algorithms in DRL, to accelerate the training process. Our work considers the parallelization of the basic advantage actor-critic algorithm (Serial-A2C). Specifically, we use multiple actor-learners to mitigate the strong correlation of training data and the instability of updates, thereby reducing the training time. Each actor-learner MPI process is pinned to its own CPU core to prevent resource contention between MPI processes and to keep ParaA2C scalable. We demonstrate the effectiveness of ParaA2C through experiments on the Arcade Learning Environment (ALE) platform. Notably, ParaA2C takes less than 10 min to train on some commonly used Atari games when using 512 CPU cores.
Pages: 320 - 327
Number of pages: 8
Related papers
50 records in total
  • [41] DAG-based workflows scheduling using Actor–Critic Deep Reinforcement Learning
    Koslovski, Guilherme Piêgas
    Pereira, Kleiton
    Albuquerque, Paulo Roberto
    [J]. Future Generation Computer Systems, 2024, 150 : 354 - 363
  • [42] Dynamic spectrum access and sharing through actor-critic deep reinforcement learning
    Dong, Liang
    Qian, Yuchen
    Xing, Yuan
    [J]. EURASIP JOURNAL ON WIRELESS COMMUNICATIONS AND NETWORKING, 2022, 2022 (01)
  • [43] A World Model for Actor-Critic in Reinforcement Learning
    Panov, A. I.
    Ugadiarov, L. A.
    [J]. PATTERN RECOGNITION AND IMAGE ANALYSIS, 2023, 33 (03) : 467 - 477
  • [44] Actor-Critic based Improper Reinforcement Learning
    Zaki, Mohammadi
    Mohan, Avinash
    Gopalan, Aditya
    Mannor, Shie
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022
  • [45] Curious Hierarchical Actor-Critic Reinforcement Learning
    Roeder, Frank
    Eppe, Manfred
    Nguyen, Phuong D. H.
    Wermter, Stefan
    [J]. ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2020, PT II, 2020, 12397 : 408 - 419
  • [46] The Actor-Dueling-Critic Method for Reinforcement Learning
    Wu, Menghao
    Gao, Yanbin
    Jung, Alexander
    Zhang, Qiang
    Du, Shitong
    [J]. SENSORS, 2019, 19 (07)
  • [47] Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
    Haarnoja, Tuomas
    Zhou, Aurick
    Abbeel, Pieter
    Levine, Sergey
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 80, 2018
  • [48] A fuzzy Actor-Critic reinforcement learning network
    Wang, Xue-Song
    Cheng, Yu-Hu
    Yi, Jian-Qiang
    [J]. INFORMATION SCIENCES, 2007, 177 (18) : 3764 - 3781
  • [49] A modified actor-critic reinforcement learning algorithm
    Mustapha, SM
    Lachiver, G
    [J]. 2000 CANADIAN CONFERENCE ON ELECTRICAL AND COMPUTER ENGINEERING, CONFERENCE PROCEEDINGS, VOLS 1 AND 2: NAVIGATING TO A NEW ERA, 2000, : 605 - 609
  • [50] Adversarially Trained Actor Critic for Offline Reinforcement Learning
    Cheng, Ching-An
    Xie, Tengyang
    Jiang, Nan
    Agarwal, Alekh
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022