Adversarial Training with Fast Gradient Projection Method against Synonym Substitution Based Text Attacks

Cited by: 0
Authors
Wang, Xiaosen [1]
Yang, Yichen [1]
Deng, Yihe [2]
He, Kun [1]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Comp Sci & Technol, Wuhan, Peoples R China
[2] Univ Calif Los Angeles, Dept Comp Sci, Los Angeles, CA 90024 USA
Keywords: (none listed)
DOI: none available
CLC number: TP18 [Artificial Intelligence Theory]
Discipline classification codes: 081104; 0812; 0835; 1405
Abstract
Adversarial training is empirically the most successful approach for improving the robustness of deep neural networks on image classification. For text classification, however, existing synonym substitution based adversarial attacks are effective but too inefficient to be incorporated into practical text adversarial training. Gradient-based attacks, which are highly efficient for images, are hard to adapt to synonym substitution based text attacks because of the lexical, grammatical, and semantic constraints and the discrete text input space. We therefore propose a fast, synonym substitution based text adversarial attack called the Fast Gradient Projection Method (FGPM), which is roughly 20 times faster than existing text attack methods while achieving comparable attack performance. We then incorporate FGPM into adversarial training and propose a text defense method called Adversarial Training with FGPM enhanced by Logit pairing (ATFL). Experiments show that ATFL significantly improves model robustness and blocks the transferability of adversarial examples.
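The abstract describes FGPM only at a high level; the minimal sketch below is not the authors' released implementation but illustrates the underlying idea under a few stated assumptions: a PyTorch classifier "model" whose forward pass accepts word embeddings directly, an embedding matrix "emb" of shape (vocab_size, dim), and a precomputed synonym table mapping each word id to candidate word ids. All of these names are illustrative placeholders. Each iteration computes the loss gradient with respect to the word embeddings once, scores every candidate synonym substitution by projecting its embedding offset onto that gradient (a first-order estimate of the loss change), applies the single best substitution, and stops when the prediction flips or the substitution budget is exhausted.

    # Minimal sketch of the FGPM idea described above, not the authors'
    # reference implementation. Assumes: "model" runs on word embeddings,
    # "emb" is a (vocab_size, dim) tensor, "synonyms" maps a word id to a
    # list of candidate word ids. All names are illustrative placeholders.
    import torch
    import torch.nn.functional as F

    def fgpm_attack(model, emb, synonyms, token_ids, label, max_subs=5):
        token_ids = token_ids.clone()
        for _ in range(max_subs):
            x = emb[token_ids].detach().requires_grad_(True)   # (seq_len, dim)
            loss = F.cross_entropy(model(x.unsqueeze(0)), label.unsqueeze(0))
            grad = torch.autograd.grad(loss, x)[0]              # d loss / d embeddings

            # Score each synonym substitution by projecting its embedding
            # offset onto the gradient: a first-order estimate of the loss
            # increase from replacing word wid at position pos with sid.
            best_gain, best_pos, best_sub = 0.0, None, None
            for pos, wid in enumerate(token_ids.tolist()):
                for sid in synonyms.get(wid, []):
                    gain = torch.dot(emb[sid] - emb[wid], grad[pos]).item()
                    if gain > best_gain:
                        best_gain, best_pos, best_sub = gain, pos, sid

            if best_pos is None:        # no substitution is expected to help
                break
            token_ids[best_pos] = best_sub
            with torch.no_grad():
                pred = model(emb[token_ids].unsqueeze(0)).argmax(-1).item()
            if pred != label.item():    # prediction flipped: attack succeeded
                break
        return token_ids

For ATFL, the abstract indicates that adversarial examples generated by FGPM are used during adversarial training together with logit pairing; a typical logit-pairing term (the paper's exact formulation may differ) adds to the standard classification loss a penalty on the squared distance between the logits of each clean example and its adversarial counterpart.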
Pages: 13997-14005 (9 pages)
Related papers (50 in total)
  • [21] Towards Robustness of Text-to-SQL Models against Synonym Substitution
    Gan, Yujian
    Chen, Xinyun
    Huang, Qiuping
    Purver, Matthew
    Woodward, John R.
    Xie, Jinxia
    Huang, Pengsheng
    59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING, VOL 1 (ACL-IJCNLP 2021), 2021, : 2505 - 2515
  • [22] Defense Against Adversarial Attacks Using Topology Aligning Adversarial Training
    Kuang, Huafeng
    Liu, Hong
    Lin, Xianming
    Ji, Rongrong
    IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, 2024, 19 : 3659 - 3673
  • [23] Perturbation analysis of gradient-based adversarial attacks
    Ozbulak, Utku
    Gasparyan, Manvel
    De Neve, Wesley
    Van Messem, Arnout
    PATTERN RECOGNITION LETTERS, 2020, 135 : 313 - 320
  • [24] A Defense Method Against Facial Adversarial Attacks
    Sadu, Chiranjeevi
    Das, Pradip K.
    2021 IEEE REGION 10 CONFERENCE (TENCON 2021), 2021, : 459 - 463
  • [25] Fine Tuning Lasso in an Adversarial Environment Against Gradient Attacks
    Ditzler, Gregory
    Prater, Ashley
    2017 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (SSCI), 2017, : 1828 - 1834
  • [26] A General Adversarial Attack Method Based on Random Gradient Ascent and Spherical Projection
    Fan C.-L.
    Li Y.-D.
    Xia X.-F.
    Qiao J.-Z.
Dongbei Daxue Xuebao/Journal of Northeastern University, 2022, 43 (02): 168 - 175
  • [27] Fast adversarial training method based on discrete cosine transform
    Wang, Xiaomiao
    Zhang, Yujin
    Zhang, Tao
    Tian, Jin
    Wu, Fei
Zhejiang Daxue Xuebao (Gongxue Ban)/Journal of Zhejiang University (Engineering Science), 2024, 58 (11): 2230 - 2238
  • [28] Paraphrasing Method Based on Contextual Synonym Substitution
    Barmawi, Ari Moesriami
    Muhammad, Ali
    JOURNAL OF ICT RESEARCH AND APPLICATIONS, 2019, 13 (03) : 257 - 282
  • [29] Arabic Synonym BERT-based Adversarial Examples for Text Classification
    Alshahrani, Norah
    Alshahrani, Saied
    Wali, Esma
    Matthews, Jeanna
    PROCEEDINGS OF THE 18TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: STUDENT RESEARCH WORKSHOP, 2024, : 137 - 147
  • [30] On the Effectiveness of Adversarial Training in Defending against Adversarial Example Attacks for Image Classification
    Park, Sanglee
    So, Jungmin
APPLIED SCIENCES-BASEL, 2020, 10 (22): 1 - 16