Survey of Adversarial Attack, Defense and Robustness Analysis for Natural Language Processing

Cited by: 0
Authors
Zheng H. [1]
Chen J. [1,2]
Zhang Y. [1]
Zhang X. [3]
Ge C. [4]
Liu Z. [4]
Ouyang Y. [5]
Ji S. [6]
Affiliations
[1] College of Information Engineering, Zhejiang University of Technology, Hangzhou
[2] Cyberspace Security Research Institute, Zhejiang University of Technology, Hangzhou
[3] College of Control Science and Engineering, Zhejiang University, Hangzhou
[4] College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing
[5] Nanjing Research Center, Huawei Technologies Co., Ltd., Nanjing
[6] College of Computer Science and Technology, Zhejiang University, Hangzhou
Funding
National Natural Science Foundation of China
Keywords
Adversarial attack; Deep neural network; Defense; Natural language processing; Robustness
DOI
10.7544/issn1000-1239.2021.20210304
Abstract
With the rapid development of artificial intelligence, deep neural networks have been widely applied in computer vision, signal analysis, and natural language processing. Natural language processing helps machines process, understand, and use human language through functions such as syntactic analysis, semantic analysis, and text comprehension. However, existing studies have shown that deep models are vulnerable to attacks from adversarial texts: by adding imperceptible adversarial perturbations to normal texts, an attacker can cause natural language processing models to make wrong predictions. To improve the robustness of natural language processing models, research on defenses has also developed rapidly in recent years. Building on existing work, we comprehensively review adversarial attacks, defenses, and robustness analysis for natural language processing tasks. Specifically, we first introduce the research tasks and the related natural language processing models. We then describe attack and defense approaches separately. We further investigate certified robustness analysis and benchmark datasets for natural language processing models, and provide a detailed introduction to application platforms and toolkits. Finally, we summarize future research directions for attacks and defenses. © 2021, Science Press. All rights reserved.
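The core attack mechanism the abstract describes, small text perturbations that flip a model's prediction, can be illustrated with a minimal sketch. The toy keyword classifier, the synonym table, and the greedy search below are hypothetical stand-ins chosen for self-containment, not methods from the surveyed papers:

```python
# Minimal sketch of a synonym-substitution adversarial attack on text.
# Everything here (classifier, synonym table) is a hypothetical toy;
# real attacks target trained neural models with richer substitution sets.

SYNONYMS = {"terrible": ["awful", "dreadful"], "great": ["fine", "superb"]}

def toy_sentiment(text: str) -> str:
    """A deliberately brittle keyword-counting classifier."""
    score = 0
    for tok in text.lower().split():
        if tok in {"great", "good", "excellent"}:
            score += 1
        elif tok in {"terrible", "bad", "awful"}:
            score -= 1
    return "positive" if score >= 0 else "negative"

def greedy_attack(text: str, classify=toy_sentiment) -> str:
    """Try synonym swaps one word at a time until the label flips."""
    original = classify(text)
    words = text.split()
    for i, word in enumerate(words):
        for syn in SYNONYMS.get(word.lower(), []):
            candidate = " ".join(words[:i] + [syn] + words[i + 1:])
            if classify(candidate) != original:
                return candidate  # near-synonymous text, flipped prediction
    return text  # no successful perturbation found

if __name__ == "__main__":
    x = "the movie was terrible"
    adv = greedy_attack(x)
    print(x, "->", toy_sentiment(x))      # negative
    print(adv, "->", toy_sentiment(adv))  # "dreadful" evades the keyword list
```

The attacks covered in the survey replace the toy classifier with a trained neural model and constrain substitutions by embedding distance or language-model fluency, while defenses such as adversarial training augment the training data with exactly these kinds of perturbed examples.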
Pages: 1727-1750
Page count: 23