Combating Word-level Adversarial Text with Robust Adversarial Training

Authors
Du, Xiaohu [1]
Yu, Jie [1]
Li, Shasha [1]
Yi, Zibo [1]
Liu, Hai [2]
Ma, Jun [1]
Affiliations
[1] Natl Univ Def Technol, Coll Comp, Changsha, Peoples R China
[2] Logist Res Inst Sci & Technol, Beijing, Peoples R China
Keywords
Deep neural network; Deep learning; Adversarial training; Adversarial attack; Stability
DOI
10.1109/IJCNN52387.2021.9533725
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
NLP models perform well on many tasks, but they are easily fooled by adversarial examples: a small perturbation can change the output of a deep neural network. Such perturbations are hard for humans to perceive, especially in adversarial examples generated by word-level attacks. Character-level adversarial attacks can be defended against with grammar detection and word recognition. Existing word-level textual adversarial attacks, however, are based on synonym replacement, so the adversarial texts usually have correct grammar and semantics, which makes defending against them more challenging. In this paper, we propose a framework called Robust Adversarial Training (RAT) to defend against word-level adversarial attacks. RAT strengthens the model by combining adversarial training with data perturbation during training. Our experiments on two datasets show that a model trained under our framework can effectively defend against word-level adversarial attacks. Compared with existing defense methods, the model trained under RAT has a higher defense success rate on 1000 adversarial examples. In addition, its accuracy on the standard test set is better than that of existing defense methods and is very close to, or even higher than, that of the standard model.
Pages: 8