Universal Soldier: Using Universal Adversarial Perturbations for Detecting Backdoor Attacks

Cited by: 0
Authors
Xu, Xiaoyun [1]
Ersoy, Oguzhan [1]
Tajalli, Behrad [1]
Picek, Stjepan [1]
Affiliations
[1] Radboud Univ Nijmegen, Digital Secur Grp, Nijmegen, Netherlands
Keywords
Universal Adversarial Perturbation; Backdoor Attack; Backdoor Detection
DOI
10.1109/DSN-W60302.2024.00024
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A deep learning model may be poisoned and still perform as expected when receiving a clean input but will misclassify when receiving a backdoored input. This is similar to universal adversarial perturbations (UAP). Indeed, UAPs are input-agnostic perturbations capable of misleading a well-trained model. We observe an intuitive phenomenon: UAPs generated from backdoored models need fewer perturbations than UAPs from clean models for a successful attack. UAPs from backdoored models tend to exploit the shortcut from all classes to the target class, built by the backdoor. Based on this finding, we propose a backdoor detection method called Universal Soldier for Backdoor Detection (USB). With it, we can reverse engineer potential backdoor triggers via UAPs. Experiments on 240 models show that USB effectively detects the injected backdoor and provides comparable or better results than state-of-the-art methods.
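The observation in the abstract (a backdoored model needs a much smaller perturbation to push inputs toward its target class) suggests a simple screening step once per-class perturbation sizes are available. The sketch below is illustrative only and is not taken from the paper: the abstract does not state USB's decision rule, so the median-absolute-deviation (MAD) outlier test, the function name `flag_suspicious_classes`, the `threshold` value, and the example norms are all assumptions made for this example.

```python
import numpy as np


def flag_suspicious_classes(norms, threshold=2.0):
    """Flag classes whose perturbation norm is an abnormally small outlier.

    norms:     one perturbation size per candidate target class
               (hypothetical input, e.g. the L1 norm of a targeted UAP).
    threshold: assumed cut-off on the MAD-based anomaly index.
    """
    norms = np.asarray(norms, dtype=float)
    median = np.median(norms)
    # 1.4826 makes the MAD comparable to a standard deviation
    # when the data are roughly normally distributed.
    mad = 1.4826 * np.median(np.abs(norms - median))
    if mad == 0:
        # All norms are (nearly) identical, so nothing stands out.
        return []
    # Only norms far BELOW the median are of interest here.
    anomaly_index = (median - norms) / mad
    return [int(i) for i in np.where(anomaly_index > threshold)[0]]


# Made-up norms for a 10-class model; class 3 needs far less perturbation.
norms = [95.0, 102.0, 98.0, 12.0, 101.0, 97.0, 99.0, 103.0, 96.0, 100.0]
suspects = flag_suspicious_classes(norms)
print(suspects)
```

With these made-up numbers only class 3 is flagged. In practice the norms would come from whatever perturbation-generation procedure is used, and the threshold would need tuning on known clean models.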
Pages: 66-73
Page count: 8