Understanding adversarial attacks on observations in deep reinforcement learning

Times Cited: 0
Authors
You, Qiaoben [1 ]
Ying, Chengyang [1 ]
Zhou, Xinning [1 ]
Su, Hang [1 ,2 ]
Zhu, Jun [1 ,2 ]
Zhang, Bo [1 ]
Affiliations
[1] Tsinghua Univ, Beijing Natl Res Ctr Informat Sci & Technol, Tsinghua-Bosch Joint Ctr Machine Learning, Inst Artificial Intelligence, Dept Comp Sci & Technol, Beijing 100084, Peoples R China
[2] Peng Cheng Lab, Shenzhen 518055, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
deep learning; reinforcement learning; adversarial robustness; adversarial attack; GO;
DOI
10.1007/s11432-021-3688-y
CLC number
TP [automation technology, computer technology];
Discipline classification code
0812;
Abstract
Deep reinforcement learning models are vulnerable to adversarial attacks that can decrease a victim's cumulative expected reward by manipulating its observations. Although optimization-based methods efficiently generate adversarial noise in supervised learning, they may fail to drive the victim to the lowest cumulative reward because they generally do not exploit the environment dynamics. This paper provides a framework for better understanding existing methods by reformulating the problem of adversarial attacks on reinforcement learning in the function space. The reformulation yields an optimal adversary in the function space of targeted attacks, which is realized via a generic two-stage framework. In the first stage, a deceptive policy is trained by hacking the environment to discover a set of trajectories that route to the lowest reward, i.e., the worst-case performance. In the second stage, the adversary misleads the victim into imitating the deceptive policy by perturbing its observations. It is theoretically shown that, compared with existing approaches, the proposed adversary is stronger under an appropriate noise level. Extensive experiments demonstrate the efficiency and effectiveness of the proposed method, which achieves state-of-the-art performance in both Atari and MuJoCo environments.
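The abstract describes a generic two-stage attack: stage 1 trains a deceptive policy that seeks the worst-case return, and stage 2 perturbs each observation so the victim imitates that policy. The sketch below is a minimal, hypothetical Python/PyTorch rendering of that pipeline, not the authors' implementation. It assumes a discrete-action victim that maps a single observation to action logits, an old-style Gym step API, a precomputed `deceptive_policy` (e.g., trained in stage 1 with any standard RL algorithm on the negated reward), and a targeted PGD step for the per-step imitation objective; the names `pgd_perturb` and `run_attack` are illustrative.

```python
# Hypothetical sketch of the two-stage observation attack (not the authors'
# released code). Assumes: victim(obs) -> action logits for one observation,
# and an old-style Gym env (reset() -> obs, step(a) -> obs, reward, done, info).
import torch
import torch.nn.functional as F

def pgd_perturb(victim, obs, target_action, eps=0.05, alpha=0.01, steps=10):
    """Stage 2: craft an L-infinity-bounded perturbation that misleads the
    victim into taking the deceptive policy's action (targeted PGD on the
    cross-entropy toward target_action)."""
    obs = torch.as_tensor(obs, dtype=torch.float32)
    delta = torch.zeros_like(obs, requires_grad=True)
    target = torch.tensor([target_action])
    for _ in range(steps):
        logits = victim(obs + delta)                  # victim's action logits
        loss = F.cross_entropy(logits.unsqueeze(0), target)
        loss.backward()
        with torch.no_grad():
            delta -= alpha * delta.grad.sign()        # descend toward target
            delta.clamp_(-eps, eps)                   # project into eps-ball
        delta.grad.zero_()
    return (obs + delta).detach()

def run_attack(env, victim, deceptive_policy, eps=0.05):
    """Roll out the victim under attack. deceptive_policy is assumed to have
    been trained in stage 1 (e.g., any RL algorithm on the negated reward),
    so it routes trajectories toward the worst-case return."""
    obs, total_reward, done = env.reset(), 0.0, False
    while not done:
        target_action = deceptive_policy(obs)         # worst-case action
        adv_obs = pgd_perturb(victim, obs, target_action, eps=eps)
        action = victim(adv_obs).argmax().item()      # victim acts on adv obs
        obs, reward, done, _ = env.step(action)
        total_reward += reward
    return total_reward                               # attacked return
```

The greedy per-step PGD imitation above is only one common way to approximate the attack; the paper defines the optimal adversary in the function space, of which such per-step perturbation schemes are practical instances.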
Pages: 15