A Survey of Adversarial Defenses and Robustness in NLP

Cited by: 21
Authors
Goyal, Shreya [1 ]
Doddapaneni, Sumanth [1 ]
Khapra, Mitesh M. [1 ]
Ravindran, Balaraman [1 ]
Affiliations
[1] Indian Inst Technol Madras, Bhupat & Jyoti Mehta Sch Biosci, Robert Bosch Ctr Data Sci & AI, Chennai 600036, Tamil Nadu, India
Keywords
Adversarial attacks; adversarial defenses; perturbations; NLP; deep neural networks; computer vision; attacks
DOI
10.1145/3593042
CLC number
TP301 [Theory and Methods]
Subject classification code
081202
Abstract
In the past few years, it has become increasingly evident that deep neural networks are not resilient enough to withstand adversarial perturbations in input data, leaving them vulnerable to attack. Various authors have proposed strong adversarial attacks for computer vision and Natural Language Processing (NLP) tasks. In response, many defense mechanisms have been proposed to prevent these networks from failing. The significance of defending neural networks against adversarial attacks lies in ensuring that the model's predictions remain unchanged even when the input data is perturbed. Several methods for adversarial defense in NLP have been proposed, catering to different NLP tasks such as text classification, named entity recognition, and natural language inference. Some of these methods not only defend neural networks against adversarial attacks but also act as a regularization mechanism during training, which helps prevent overfitting. This survey reviews the various methods proposed for adversarial defenses in NLP over the past few years by introducing a novel taxonomy. It also highlights the fragility of advanced deep neural networks in NLP and the challenges involved in defending them.
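To make the regularization point above concrete, below is a minimal sketch of one common defense of this kind: FGM-style adversarial training, which perturbs word embeddings along the loss gradient and trains on the clean plus adversarial loss. This is an illustrative assumption, not the survey's own implementation; the model, function names, and hyperparameters (TinyTextClassifier, adversarial_training_step, epsilon) are all hypothetical.

```python
# Sketch: FGM-style adversarial training on word embeddings (PyTorch).
# Everything here (model, sizes, epsilon) is illustrative, not from the survey.
import torch
import torch.nn as nn

class TinyTextClassifier(nn.Module):
    """A deliberately small classifier: embed -> mean-pool -> linear."""
    def __init__(self, vocab_size=1000, embed_dim=64, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.fc = nn.Linear(embed_dim, num_classes)

    def forward(self, token_ids=None, embeddings=None):
        # Accept raw token ids, or precomputed (possibly perturbed) embeddings.
        if embeddings is None:
            embeddings = self.embed(token_ids)
        return self.fc(embeddings.mean(dim=1))

def adversarial_training_step(model, optimizer, token_ids, labels, epsilon=0.1):
    loss_fn = nn.CrossEntropyLoss()
    optimizer.zero_grad()

    # Clean pass; retain the embedding activations so we can read their gradient.
    embeddings = model.embed(token_ids)
    embeddings.retain_grad()
    clean_loss = loss_fn(model(embeddings=embeddings), labels)
    clean_loss.backward()  # fills parameter grads and embeddings.grad

    # Perturb the embeddings along the normalized loss gradient.
    grad = embeddings.grad
    delta = epsilon * grad / (grad.norm(dim=-1, keepdim=True) + 1e-8)

    # Adversarial pass; its gradients accumulate onto the clean ones, so the
    # adversarial loss acts as an extra regularization term during training.
    # (In this simplified sketch the embedding table only receives the clean
    # gradient; full implementations typically perturb the weight matrix itself.)
    adv_loss = loss_fn(model(embeddings=embeddings.detach() + delta), labels)
    adv_loss.backward()
    optimizer.step()
    return clean_loss.item(), adv_loss.item()

if __name__ == "__main__":
    torch.manual_seed(0)
    model = TinyTextClassifier()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    token_ids = torch.randint(0, 1000, (8, 16))  # batch of 8, sequence length 16
    labels = torch.randint(0, 2, (8,))
    print(adversarial_training_step(model, optimizer, token_ids, labels))
```

Normalizing the gradient before scaling keeps the perturbation magnitude governed by epsilon alone, which is what lets the same adversarial term serve as a tunable regularizer on top of the clean training objective.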
Pages: 39