Adversarial retraining attack of asynchronous advantage actor-critic based pathfinding

Cited by: 2
Authors
Chen Tong [1]
Liu Jiqiang [1]
Xiang Yingxiao [1]
Niu Wenjia [1]
Tong Endong [1]
Wang Shuoru [1]
Li He [1]
Chang Liang [2]
Li Gang [3]
Alfred, Chen Qi [4]
Affiliations
[1] Beijing Jiaotong Univ, Beijing Key Lab Secur & Privacy Intelligent Trans, 3 Shangyuan Village, Beijing 100044, Peoples R China
[2] Guilin Univ Elect Technol, Guangxi Key Lab Trusted Software, Guilin, Peoples R China
[3] Deakin Univ, Sch Informat Technol, Geelong, Vic, Australia
[4] Univ Calif Irvine, Donald Bren Sch Informat & Comp Sci ICS, Irvine, CA USA
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China
Keywords
A3C; evasion attack; pathfinding; reinforcement learning; retraining attack;
DOI
10.1002/int.22380
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Pathfinding is an important component of many real-world systems, such as warehouse systems and autonomous aircraft towing vehicles. With the development of reinforcement learning (RL), especially asynchronous advantage actor-critic (A3C), pathfinding is undergoing a revolution in efficient parallel learning. Like other artificial intelligence-based applications, A3C-based pathfinding is threatened by adversarial attacks. In this paper, we are the first to study an adversarial attack on A3C that unexpectedly triggers a lengthy retraining mechanism, which runs until pathfinding succeeds. We also present a method for generating attack examples based on the gradient band, in which a single baffle only a few unit lengths long is enough to carry out the attack. Experiments with detailed analysis show a high attack success rate of 95% with an average baffle length of 2.95. We also discuss defense suggestions that draw on the insights from our analysis.
Pages: 2323-2346
Page count: 24