Robustness Assessment of Asynchronous Advantage Actor-Critic Based on Dynamic Skewness and Sparseness Computation: A Parallel Computing View

Cited by: 0
Authors
Chen, Tong [1]
Liu, Ji-Qiang [1]
Li, He [1]
Wang, Shuo-Ru [1]
Niu, Wen-Jia [1]
Tong, En-Dong [1]
Chang, Liang [2]
Chen, Qi Alfred [3]
Li, Gang [4]
Affiliations
[1] Beijing Jiaotong Univ, Beijing Key Lab Secur & Privacy Intelligent Trans, Beijing 100044, Peoples R China
[2] Guilin Univ Elect Technol, Guangxi Key Lab Trusted Software, Guilin 541004, Peoples R China
[3] Univ Calif Irvine, Donald Bren Sch Informat & Comp Sci, Irvine, CA 92697 USA
[4] Deakin Univ, Ctr Cyber Secur Res & Innovat, Geelong, Vic 3216, Australia
Funding
National Natural Science Foundation of China;
Keywords
robustness assessment; skewness; sparseness; asynchronous advantage actor-critic; reinforcement learning; NEURAL-NETWORKS;
DOI
10.1007/s11390-021-1217-z
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Reinforcement learning as autonomous learning is greatly driving artificial intelligence (AI) development to practical applications. Having demonstrated the potential to significantly improve synchronously parallel learning, the parallel computing based asynchronous advantage actor-critic (A3C) opens a new door for reinforcement learning. Unfortunately, the acceleration's influence on A3C robustness has been largely overlooked. In this paper, we perform the first robustness assessment of A3C based on parallel computing. By perceiving the policy's action, we construct a global matrix of action probability deviation and define two novel measures of skewness and sparseness to form an integral robustness measure. Based on such static assessment, we then develop a dynamic robustness assessing algorithm through situational whole-space state sampling of changing episodes. Extensive experiments with different combinations of agent number and learning rate are implemented on an A3C-based pathfinding application, demonstrating that our proposed robustness assessment can effectively measure the robustness of A3C, which can achieve an accuracy of 83.3%.
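The abstract does not give the paper's formulas, so the following is only an illustrative sketch of the general idea: build a state-by-action matrix of action probability deviations, then summarize it with a skewness value and a sparseness value. It substitutes textbook statistics (Fisher-Pearson sample skewness and Hoyer sparseness) for the paper's own measures, which may be defined differently. All function names, the reference and test policies, and the example numbers are hypothetical.

```python
import math

def deviation_matrix(p_ref, p_test):
    """Absolute per-state, per-action probability deviation.

    Rows are states, columns are actions. Comparing two policy tables
    this way is an assumption, not the paper's construction.
    """
    return [[abs(t - r) for r, t in zip(row_ref, row_test)]
            for row_ref, row_test in zip(p_ref, p_test)]

def flatten(matrix):
    """Flatten a list of rows into a single list of values."""
    return [v for row in matrix for v in row]

def skewness(values):
    """Fisher-Pearson sample skewness (textbook definition, swapped in)."""
    n = len(values)
    if n == 0:
        return 0.0
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    if m2 == 0.0:
        return 0.0  # constant data has no asymmetry
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def hoyer_sparseness(values):
    """Hoyer sparseness: 0 for equal-magnitude entries, 1 for one nonzero entry."""
    n = len(values)
    l1 = sum(abs(v) for v in values)
    l2 = math.sqrt(sum(v * v for v in values))
    if n < 2 or l2 == 0.0:
        return 0.0
    return (math.sqrt(n) - l1 / l2) / (math.sqrt(n) - 1.0)

# Hypothetical example: two states, four actions each.
p_ref = [[0.25, 0.25, 0.25, 0.25], [0.25, 0.25, 0.25, 0.25]]
p_test = [[0.25, 0.25, 0.25, 0.25], [0.85, 0.05, 0.05, 0.05]]
dev = flatten(deviation_matrix(p_ref, p_test))
print("skewness:", skewness(dev))
print("sparseness:", hoyer_sparseness(dev))
```

In this sketch, deviations concentrated in a few state-action pairs give a high sparseness value, and a long right tail of large deviations gives positive skewness. How the paper combines its two measures into a single robustness score is not stated in the abstract and is not reproduced here.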
Pages: 1002-1021
Page count: 20
Related papers
24 in total
  • [1] Robustness Assessment of Asynchronous Advantage Actor-Critic Based on Dynamic Skewness and Sparseness Computation: A Parallel Computing View
    Tong Chen
    Ji-Qiang Liu
    He Li
    Shuo-Ru Wang
    Wen-Jia Niu
    En-Dong Tong
    Liang Chang
    Qi Alfred Chen
    Gang Li
    [J]. Journal of Computer Science and Technology, 2021, 36 : 1002 - 1021
  • [2] Dynamic service function chain placement in mobile computing: An asynchronous advantage actor-critic based approach
    Jiang, Heling
    Xia, Hai
    Zare, Mansoureh
    [J]. TRANSACTIONS ON EMERGING TELECOMMUNICATIONS TECHNOLOGIES, 2024, 35 (08)
  • [3] Adversarial retraining attack of asynchronous advantage actor-critic based pathfinding
    Chen Tong
    Liu Jiqiang
    Xiang Yingxiao
    Niu Wenjia
    Tong Endong
    Wang Shuoru
    Li He
    Chang Liang
    Li Gang
    Alfred, Chen Qi
    [J]. INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, 2021, 36 (05) : 2323 - 2346
  • [4] Traffic signal control method based on asynchronous advantage actor-critic
    Ye, Baolin
    Sun, Ruitao
    Wu, Weimin
    Chen, Bin
    Yao, Qing
    [J]. Zhejiang Daxue Xuebao (Gongxue Ban)/Journal of Zhejiang University (Engineering Science), 2024, 58 (08) : 1671 - 1680
  • [5] Optimization of Robot Environment Interaction Based on Asynchronous Advantage Actor-Critic Algorithm
    Xu, Jitang
    Chen, Qiang
    [J]. INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2024, 15 (06) : 1350 - 1359
  • [6] VMP-A3C: Virtual machines placement in cloud computing based on asynchronous advantage actor-critic algorithm
    Wei, Pengcheng
    Zeng, Yushan
    Yan, Bei
    Zhou, Jiahui
    Nikougoftar, Elaheh
    [J]. JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES, 2023, 35 (05)
  • [7] Workflow scheduling based on asynchronous advantage actor-critic algorithm in multi-cloud environment
    Tang, Xuhao
    Liu, Fagui
    Wang, Bin
    Xu, Dishi
    Jiang, Jun
    Wu, Qingbo
    Chen, C. L. Philip
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2024, 258
  • [8] Resource allocation Algorithm of Service Function Chain Based on Asynchronous Advantage Actor-Critic Learning
    Tang Lun
    He Xiaoyu
    Wang Xiao
    Tan Qi
    Hu Yanjuan
    Chen Qianbin
    [J]. JOURNAL OF ELECTRONICS & INFORMATION TECHNOLOGY, 2021, 43 (06) : 1733 - 1741
  • [9] Design and application of adaptive PID controller based on asynchronous advantage actor-critic learning method
    Sun, Qifeng
    Du, Chengze
    Duan, Youxiang
    Ren, Hui
    Li, Hongqiang
    [J]. WIRELESS NETWORKS, 2021, 27 (05) : 3537 - 3547
  • [10] A new noise network and gradient parallelisation-based asynchronous advantage actor-critic algorithm
    Fei, Zhengshun
    Wang, Yanping
    Wang, Jinglong
    Liu, Kangling
    Huang, Bingqiang
    Tan, Ping
    [J]. IET CYBER-SYSTEMS AND ROBOTICS, 2022, 4 (03) : 175 - 188