Combating Word-level Adversarial Text with Robust Adversarial Training

Cited by: 0
Authors
Du, Xiaohu [1 ]
Yu, Jie [1 ]
Li, Shasha [1 ]
Yi, Zibo [1 ]
Liu, Hai [2 ]
Ma, Jun [1 ]
Affiliations
[1] Natl Univ Def Technol, Coll Comp, Changsha, Peoples R China
[2] Logist Res Inst Sci & Technol, Beijing, Peoples R China
Keywords
Deep neural network; Deep learning; Adversarial training; Adversarial attack; Stability
DOI
10.1109/IJCNN52387.2021.9533725
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
NLP models perform well on many tasks, but they are easily fooled by adversarial examples: a small perturbation can change the output of a deep neural network model. Such perturbations are hard for humans to perceive, especially in adversarial examples generated by word-level attacks. Character-level adversarial attacks can be defended against with grammar checking and word recognition, whereas existing word-level textual attacks are based on synonym replacement, so the adversarial texts usually retain correct grammar and semantics, which makes defending against word-level attacks more challenging. In this paper, we propose a framework called Robust Adversarial Training (RAT) to defend against word-level adversarial attacks. RAT strengthens the model by combining adversarial training with data perturbation during training. Experiments on two datasets show that models trained under our framework effectively defend against word-level adversarial attacks. Compared with existing defense methods, the model trained under RAT achieves a higher defense success rate on 1000 adversarial examples. In addition, its accuracy on the standard test set surpasses that of existing defense methods and is very close to, or even higher than, that of the standard model.
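The abstract only sketches the idea of combining adversarial training with data perturbation; the paper's exact implementation is not given here. The following is a minimal, hypothetical sketch of that kind of training loop, where each batch's loss mixes the clean-text loss with the loss on word-level (synonym-substituted) copies. The helpers `encode` and `get_synonyms`, the model, and all hyperparameters are illustrative assumptions, not the authors' released code.

```python
# Hypothetical sketch: adversarial training combined with word-level data
# perturbation (synonym substitution), loosely in the spirit of RAT.
import random
import torch
import torch.nn.functional as F

def perturb_words(tokens, get_synonyms, p=0.15):
    """Randomly replace a fraction of words with one of their synonyms."""
    perturbed = []
    for tok in tokens:
        syns = get_synonyms(tok)          # assumed synonym lookup, e.g. a WordNet wrapper
        if syns and random.random() < p:
            perturbed.append(random.choice(syns))
        else:
            perturbed.append(tok)
    return perturbed

def training_step(model, encode, batch_tokens, labels, get_synonyms,
                  optimizer, alpha=0.5):
    """One step: standard loss plus loss on synonym-perturbed copies of the batch."""
    clean_ids = encode(batch_tokens)                                   # token ids for clean text
    noisy_ids = encode([perturb_words(t, get_synonyms) for t in batch_tokens])
    logits_clean = model(clean_ids)
    logits_noisy = model(noisy_ids)
    loss = (F.cross_entropy(logits_clean, labels)
            + alpha * F.cross_entropy(logits_noisy, labels))           # alpha weights the perturbed term
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The weighting factor `alpha` and the replacement rate `p` are placeholders; in practice they would be tuned so that robustness to word-level substitutions improves without degrading clean-test accuracy.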
Pages: 8
Related Papers
50 records in total
  • [41] Building a Robust Word-Level Wakeword Verification Network
    Kumar, Rajath
    Rodehorst, Mike
    Wang, Joe
    Gu, Jiacheng
    Kulis, Brian
    INTERSPEECH 2020, 2020, : 1972 - 1976
  • [42] A noise robust method for word-level pronunciation assessment
    Lin, Binghuai
    Wang, Liyuan
    INTERSPEECH 2021, 2021, : 781 - 785
  • [43] Blind Adversarial Training: Towards Comprehensively Robust Models Against Blind Adversarial Attacks
    Xie, Haidong
    Xiang, Xueshuang
    Dong, Bin
    Liu, Naijin
    ARTIFICIAL INTELLIGENCE, CICAI 2023, PT II, 2024, 14474 : 15 - 26
  • [44] Design of robust hyperspectral image classifier based on adversarial training against adversarial attack
    Park, I.
    Kim, S.
    Journal of Institute of Control, Robotics and Systems, 2021, 27 (06) : 389 - 400
  • [45] Robust Graph Neural Networks Against Adversarial Attacks via Jointly Adversarial Training
    Tian, Hu
    Ye, Bowei
    Zheng, Xiaolong
    Wu, Desheng Dash
    IFAC PAPERSONLINE, 2020, 53 (05): : 420 - 425
  • [46] An analytical handwritten word recognition system with word-level discriminant training
    Tay, YH
    Lallican, PM
    Khalid, M
    Knerr, S
    Viard-Gaudin, C
    SIXTH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION, PROCEEDINGS, 2001, : 726 - 730
  • [47] COMBATING FALSE SENSE OF SECURITY: BREAKING THE DEFENSE OF ADVERSARIAL TRAINING VIA NON-GRADIENT ADVERSARIAL ATTACK
    Fan, Mingyuan
    Liu, Yang
    Chen, Cen
    Yu, Shengxing
    Guo, Wenzhong
    Liu, Ximeng
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 3293 - 3297
  • [48] CRank: Reusable Word Importance Ranking for Text Adversarial Attack
    Chen, Xinyi
    Liu, Bo
    APPLIED SCIENCES-BASEL, 2021, 11 (20):
  • [49] Adversarial training for few-shot text classification
    Croce, Danilo
    Castellucci, Giuseppe
    Basili, Roberto
    INTELLIGENZA ARTIFICIALE, 2020, 14 (02) : 201 - 214
  • [50] ARAML: A Stable Adversarial Training Framework for Text Generation
    Ke, Pei
    Huang, Fei
    Huang, Minlie
    Zhu, Xiaoyan
    2019 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING AND THE 9TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (EMNLP-IJCNLP 2019): PROCEEDINGS OF THE CONFERENCE, 2019, : 4271 - 4281