Adversarial retraining attack of asynchronous advantage actor-critic based pathfinding

Cited by: 2

Authors:

Chen Tong [1]

Liu Jiqiang [1]

Xiang Yingxiao [1]

Niu Wenjia [1]

Tong Endong [1]

Wang Shuoru [1]

Li He [1]

Chang Liang [2]

Li Gang [3]

Alfred, Chen Qi [4]

Affiliations:

[1] Beijing Jiaotong Univ, Beijing Key Lab Secur & Privacy Intelligent Trans, 3 Shangyuan Village, Beijing 100044, Peoples R China

[2] Guilin Univ Elect Technol, Guangxi Key Lab Trusted Software, Guilin, Peoples R China

[3] Deakin Univ, Sch Informat Technol, Geelong, Vic, Australia

[4] Univ Calif Irvine, Donald Bren Sch Informat & Comp Sci ICS, Irvine, CA USA

Source:

INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS | 2021, Vol. 36, Issue 05

Funding:

National Natural Science Foundation of China; National Key Research and Development Program of China;

Keywords:

A3C; evasion attack; pathfinding; reinforcement learning; retraining attack;

DOI:

10.1002/int.22380

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Pathfinding has become an important component in many real-world scenarios, such as popular warehouse systems and autonomous aircraft-towing vehicles. With the development of reinforcement learning (RL), especially asynchronous advantage actor-critic (A3C), pathfinding is undergoing a revolution in terms of efficient parallel learning. Like other artificial intelligence-based applications, A3C-based pathfinding is also threatened by adversarial attacks. In this paper, we are the first to study an adversarial attack on A3C that can unexpectedly trigger a lengthy retraining mechanism that runs until pathfinding succeeds. We also present a method for generating attack examples based on the gradient band, in which a single baffle only a few unit lengths long is enough to perform the attack. Experiments with detailed analysis show a high attack success rate of 95% with an average baffle length of 2.95. We also discuss defense suggestions that draw on insights from our analysis.


Pages: 2323-2346

Page count: 24

50 items in total

[1] Asynchronous Advantage Actor-Critic with Double Attention Mechanisms
Ling X.-H.
Li J.
Zhu F.
Liu Q.
Fu Y.-C.
Zhu, Fei (zhufei@suda.edu.cn), 2020, Science Press (43): : 93 - 106
[2] Traffic signal control method based on asynchronous advantage actor-critic
Ye, Baolin
Sun, Ruitao
Wu, Weimin
Chen, Bin
Yao, Qing
Zhejiang Daxue Xuebao (Gongxue Ban)/Journal of Zhejiang University (Engineering Science), 2024, 58 (08): : 1671 - 1680
[3] An accelerated asynchronous advantage actor-critic algorithm applied in papermaking
Wang, Xuechun
Zhuang, Zhiwei
Zou, Luobao
Zhang, Weidong
PROCEEDINGS OF THE 38TH CHINESE CONTROL CONFERENCE (CCC), 2019, : 8637 - 8642
[4] Optimization of Robot Environment Interaction Based on Asynchronous Advantage Actor-Critic Algorithm
Xu, Jitang
Chen, Qiang
INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2024, 15 (06) : 1350 - 1359
[5] An Efficient Reconfigurable Battery Network Based on the Asynchronous Advantage Actor-Critic Paradigm
Yang, Feng
Meng, Jinhao
Ci, Marvin
Lin, Ni
Gao, Fei
IEEE TRANSACTIONS ON TRANSPORTATION ELECTRIFICATION, 2025, 11 (01): : 1479 - 1487
[6] Towards Understanding Asynchronous Advantage Actor-Critic: Convergence and Linear Speedup
Shen, Han
Zhang, Kaiqing
Hong, Mingyi
Chen, Tianyi
IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2023, 71 : 2579 - 2594
[7] Generative Adversarial Soft Actor-Critic
Hwang, Hyo-Seok
Kim, Yoojoong
Seok, Junhee
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024,
[8] Workflow scheduling based on asynchronous advantage actor-critic algorithm in multi-cloud environment
Tang, Xuhao
Liu, Fagui
Wang, Bin
Xu, Dishi
Jiang, Jun
Wu, Qingbo
Chen, C. L. Philip
EXPERT SYSTEMS WITH APPLICATIONS, 2024, 258
[9] Resource allocation Algorithm of Service Function Chain Based on Asynchronous Advantage Actor-Critic Learning
Tang Lun
He Xiaoyu
Wang Xiao
Tan Qi
Hu Yanjuan
Chen Qianbin
JOURNAL OF ELECTRONICS & INFORMATION TECHNOLOGY, 2021, 43 (06) : 1733 - 1741
[10] Design and application of adaptive PID controller based on asynchronous advantage actor-critic learning method
Sun, Qifeng
Du, Chengze
Duan, Youxiang
Ren, Hui
Li, Hongqiang
WIRELESS NETWORKS, 2021, 27 (05) : 3537 - 3547
