Textual Backdoor Attack via Keyword Positioning

Citations: 0
Authors
Chen, Depeng [1 ,2 ]
Mao, Fangfang [1 ]
Jin, Hulin [1 ]
Cui, Jie [1 ,2 ]
Affiliations
[1] Anhui Univ, Sch Comp Sci & Technol, Hefei 230601, Peoples R China
[2] Hefei Comprehens Natl Sci Ctr, Inst Artificial Intelligence, Hefei 230026, Peoples R China
Keywords
Backdoor attack; NLP; DNN;
DOI
10.1007/978-981-97-5609-4_5
CLC Classification
TP18 [Theory of Artificial Intelligence];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
The backdoor problem poses a potential threat to the security of neural networks. Although backdoor attacks have been extensively studied in computer vision (CV), they cannot be directly transferred to natural language processing (NLP) because text data are discrete. Data poisoning is a common backdoor attack strategy in NLP, for example replacing words in a sentence or inserting triggers (such as rare words) into it. However, most previous work selects the replacement or insertion position at random, and inserting rare words produces unnatural language expressions that are easily detected. To address these problems, this paper proposes a textual backdoor attack technique based on keyword positioning. Keyword positioning computes an importance score for each word (or for each word with a specific part of speech) to find the most vulnerable words in a sentence, that is, the keywords that most help the target model make its judgment; perturbing these words therefore often causes the target model to misjudge. In this paper, we first compute the importance score and part-of-speech tag of each word in a sentence, then select the trigger word based on the spurious correlation between individual words and the target label, and finally perturb the sentence at the keyword's position. We conducted experiments on four text classification datasets, and the results show that the proposed attack not only preserves the stealthiness of the trigger in most cases but also achieves better attack performance than the baseline methods.
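The abstract's first step, scoring each word's importance to locate "the most vulnerable words in the sentence", is commonly implemented as a leave-one-out probe: delete each word in turn and measure the drop in the target-class probability. The sketch below illustrates only this generic idea; the `toy_prob` classifier is a hypothetical stand-in, and the paper's actual scoring function, POS filtering, and trigger selection may differ.

```python
def word_importance(tokens, target_prob):
    """Rank tokens by how much deleting each one lowers the
    target-class probability returned by target_prob."""
    base = target_prob(tokens)
    scores = []
    for i in range(len(tokens)):
        reduced = tokens[:i] + tokens[i + 1:]  # sentence with token i removed
        scores.append((tokens[i], base - target_prob(reduced)))
    # Highest probability drop first: these are the "keywords".
    return sorted(scores, key=lambda kv: kv[1], reverse=True)

# Hypothetical stand-in classifier: probability rises with cue words.
CUES = {"great": 0.4, "awful": 0.3}

def toy_prob(tokens):
    return min(1.0, 0.1 + sum(CUES.get(t, 0.0) for t in tokens))

ranking = word_importance("this movie is great".split(), toy_prob)
print(ranking[0][0])  # the word whose removal hurts the score most
```

In a real attack, `target_prob` would be a trained victim or surrogate model, and the top-ranked position would be where the trigger is placed or substituted.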
Pages: 55-66 (12 pages)