Understanding adversarial attacks on observations in deep reinforcement learning

Times Cited: 0
Authors
You, Qiaoben [1 ]
Ying, Chengyang [1 ]
Zhou, Xinning [1 ]
Su, Hang [1 ,2 ]
Zhu, Jun [1 ,2 ]
Zhang, Bo [1 ]
Affiliations
[1] Tsinghua Univ, Beijing Natl Res Ctr Informat Sci & Technol, Tsinghua-Bosch Joint Ctr Machine Learning, Inst Artificial Intelligence, Dept Comp Sci & Technol, Beijing 100084, Peoples R China
[2] Peng Cheng Lab, Shenzhen 518055, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
deep learning; reinforcement learning; adversarial robustness; adversarial attack; GO;
DOI
10.1007/s11432-021-3688-y
CLC number
TP [automation technology, computer technology];
Discipline classification code
0812;
Abstract
Deep reinforcement learning models are vulnerable to adversarial attacks that can decrease a victim's cumulative expected reward by manipulating its observations. Although optimization-based methods efficiently generate adversarial noise in supervised learning, they may fail to drive the victim to the lowest cumulative reward because they generally do not exploit the environment dynamics. This paper provides a framework for better understanding existing methods by reformulating the problem of adversarial attacks on reinforcement learning in the function space. The reformulation yields an optimal adversary in the function space of targeted attacks, which is realized via a generic two-stage framework. In the first stage, a deceptive policy is trained by hacking the environment to discover a set of trajectories that route to the lowest reward, i.e., the worst-case performance. In the second stage, the adversary misleads the victim into imitating the deceptive policy by perturbing its observations. It is theoretically shown that, compared with existing approaches, the proposed adversary is stronger under an appropriate noise level. Extensive experiments demonstrate the efficiency and effectiveness of the proposed method, which achieves state-of-the-art performance in both Atari and MuJoCo environments.
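The abstract describes a generic two-stage attack: stage 1 trains a deceptive policy that seeks the worst-case return, and stage 2 perturbs each observation so the victim imitates that policy. The sketch below is a minimal, hypothetical Python/PyTorch rendering of that pipeline, not the authors' implementation. It assumes a discrete-action victim that maps a single observation to action logits, an old-style Gym step API, a precomputed `deceptive_policy` (e.g., trained in stage 1 with any standard RL algorithm on the negated reward), and a targeted PGD step for the per-step imitation objective; the names `pgd_perturb` and `run_attack` are illustrative.

```python
# Hypothetical sketch of the two-stage observation attack (not the authors'
# released code). Assumes: victim(obs) -> action logits for one observation,
# and an old-style Gym env (reset() -> obs, step(a) -> obs, reward, done, info).
import torch
import torch.nn.functional as F

def pgd_perturb(victim, obs, target_action, eps=0.05, alpha=0.01, steps=10):
    """Stage 2: craft an L-infinity-bounded perturbation that misleads the
    victim into taking the deceptive policy's action (targeted PGD on the
    cross-entropy toward target_action)."""
    obs = torch.as_tensor(obs, dtype=torch.float32)
    delta = torch.zeros_like(obs, requires_grad=True)
    target = torch.tensor([target_action])
    for _ in range(steps):
        logits = victim(obs + delta)                  # victim's action logits
        loss = F.cross_entropy(logits.unsqueeze(0), target)
        loss.backward()
        with torch.no_grad():
            delta -= alpha * delta.grad.sign()        # descend toward target
            delta.clamp_(-eps, eps)                   # project into eps-ball
        delta.grad.zero_()
    return (obs + delta).detach()

def run_attack(env, victim, deceptive_policy, eps=0.05):
    """Roll out the victim under attack. deceptive_policy is assumed to have
    been trained in stage 1 (e.g., any RL algorithm on the negated reward),
    so it routes trajectories toward the worst-case return."""
    obs, total_reward, done = env.reset(), 0.0, False
    while not done:
        target_action = deceptive_policy(obs)         # worst-case action
        adv_obs = pgd_perturb(victim, obs, target_action, eps=eps)
        action = victim(adv_obs).argmax().item()      # victim acts on adv obs
        obs, reward, done, _ = env.step(action)
        total_reward += reward
    return total_reward                               # attacked return
```

The greedy per-step PGD imitation above is only one common way to approximate the attack; the paper defines the optimal adversary in the function space, of which such per-step perturbation schemes are practical instances.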
Pages: 15