Survey of Adversarial Attack, Defense and Robustness Analysis for Natural Language Processing

Cited by: 0
Authors
Zheng H. [1]
Chen J. [1,2]
Zhang Y. [1]
Zhang X. [3]
Ge C. [4]
Liu Z. [4]
Ouyang Y. [5]
Ji S. [6]
Affiliations
[1] College of Information Engineering, Zhejiang University of Technology, Hangzhou
[2] Cyberspace Security Research Institute, Zhejiang University of Technology, Hangzhou
[3] College of Control Science and Engineering, Zhejiang University, Hangzhou
[4] College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing
[5] Nanjing Research Center, Huawei Technologies Co., Ltd., Nanjing
[6] College of Computer Science and Technology, Zhejiang University, Hangzhou
Funding
National Natural Science Foundation of China;
Keywords
Adversarial attack; Deep neural network; Defense; Natural language processing; Robustness;
DOI
10.7544/issn1000-1239.2021.20210304
Abstract
With the rapid development of artificial intelligence, deep neural networks have been widely applied in fields such as computer vision, signal analysis, and natural language processing. Natural language processing helps machines process, understand, and use human language through functions such as syntactic analysis, semantic analysis, and text comprehension. However, existing studies have shown that deep models are vulnerable to attacks from adversarial texts: by adding imperceptible adversarial perturbations to normal texts, attackers can cause natural language processing models to make wrong predictions. To improve the robustness of natural language processing models, defense-related research has also advanced in recent years. Building on existing research, we comprehensively review related work on adversarial attacks, defenses, and robustness analysis for natural language processing tasks. Specifically, we first introduce the research tasks and related natural language processing models. Then, attack and defense approaches are described separately. We further investigate certified robustness analysis and benchmark datasets for natural language processing models, and provide a detailed introduction to natural language processing application platforms and toolkits. Finally, we summarize future research directions for attacks and defenses. © 2021, Science Press. All rights reserved.
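To illustrate the attack idea described in the abstract, below is a minimal sketch of a word-substitution adversarial attack against a toy sentiment classifier. The classifier, the synonym table, and all names in this sketch are hypothetical stand-ins for illustration only; they do not reproduce any specific method covered by the survey, which generally targets real neural models and generates substitution candidates from word embeddings or language models.

# Minimal sketch of a word-substitution adversarial attack (hypothetical toy example).
# The "model" is a keyword-counting sentiment scorer, not a neural network.

SYNONYMS = {                 # hypothetical substitution candidates
    "great": ["fine", "decent"],
    "love": ["like", "enjoy"],
    "terrible": ["poor", "weak"],
}

POSITIVE = {"great", "love", "enjoy", "fine", "like"}
NEGATIVE = {"terrible", "poor", "weak", "bad"}

def predict(tokens):
    # Toy classifier: 1 = positive, 0 = negative/neutral.
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    return 1 if score > 0 else 0

def substitution_attack(tokens):
    # Try single-word synonym substitutions until the predicted label flips.
    original = predict(tokens)
    for i, tok in enumerate(tokens):
        for candidate in SYNONYMS.get(tok, []):
            perturbed = tokens[:i] + [candidate] + tokens[i + 1:]
            if predict(perturbed) != original:
                return perturbed     # adversarial example found
    return None                      # no single substitution flips the prediction

if __name__ == "__main__":
    text = "this movie is great".split()
    print(predict(text))             # 1 (positive)
    print(substitution_attack(text)) # e.g. ['this', 'movie', 'is', 'decent']

Real attacks differ mainly in how candidate substitutions are generated and in using the victim model's confidence scores to guide the search, but the overall query-perturb-check loop is the same.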
Pages: 1727-1750
Page count: 23