Backdoor Attacks Against Transfer Learning With Pre-Trained Deep Learning Models

Cited by: 27
Authors
Wang, Shuo [1 ]
Nepal, Surya [2 ]
Rudolph, Carsten [3 ]
Grobler, Marthie [2 ]
Chen, Shangyu [4 ]
Chen, Tianle [3 ]
Affiliations
[1] Monash Univ, Fac Informat Technol, Clayton, Vic 3800, Australia
[2] CSIROs Data61, Melbourne, Vic 3008, Australia
[3] Monash Univ, Fac Informat Technol, Melbourne, Vic 3800, Australia
[4] Univ Melbourne, Melbourne, Vic 3010, Australia
Keywords
Web service; deep neural network; backdoor attack; transfer learning; pre-trained model;
D O I
10.1109/TSC.2020.3000900
CLC Number
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
Transfer learning provides an effective solution for quickly and feasibly customizing accurate Student models, by transferring the knowledge learned by pre-trained Teacher models over large datasets via fine-tuning. Many pre-trained Teacher models used in transfer learning are publicly available and maintained by public platforms, increasing their vulnerability to backdoor attacks. In this article, we demonstrate a backdoor threat to transfer learning tasks on both image and time-series data that leverages the knowledge of publicly accessible Teacher models, aimed at defeating three commonly adopted defenses: pruning-based, retraining-based and input pre-processing-based defenses. Specifically, (A) a ranking-based neuron selection mechanism is proposed to speed up the backdoor trigger generation and perturbation process while defeating pruning-based and/or retraining-based defenses; (B) autoencoder-powered trigger generation is proposed to produce a robust trigger that can defeat the input pre-processing-based defense, while guaranteeing that the selected neuron(s) can be significantly activated; and (C) defense-aware retraining is used to generate the manipulated model from reverse-engineered model inputs. We launch effective misclassification attacks on Student models over real-world images, brain Magnetic Resonance Imaging (MRI) data and Electrocardiography (ECG) learning systems. The experiments reveal that our enhanced attack maintains the 98.4 and 97.2 percent classification accuracy of the genuine model on clean image and time-series inputs, while improving the attack success rate by 27.9%-100% and 27.1%-56.1% on trojaned image and time-series inputs, respectively, in the presence of pruning-based and/or retraining-based defenses.
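The abstract's steps (A) and (B) can be illustrated with a toy sketch. The snippet below is a minimal NumPy illustration, assuming a hypothetical single linear-ReLU feature layer as a stand-in for the Teacher model; the layer sizes, the mid-rank selection heuristic, and the gradient-ascent trigger search are illustrative assumptions, not the paper's actual ranking algorithm or autoencoder-based generator.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a pre-trained Teacher feature layer:
# 8 inputs -> 6 hidden ReLU neurons (sizes chosen for illustration).
W = rng.normal(size=(6, 8))

def activations(x):
    """Hidden-layer ReLU activations for one input vector."""
    return np.maximum(W @ x, 0.0)

# (A) Ranking-based selection (toy heuristic): rank neurons by mean
# activation over clean inputs and pick a mid-ranked one -- active
# enough not to be pruned as a near-dead unit, yet not so dominant
# that fine-tuning is likely to rewrite it.
clean = rng.normal(size=(100, 8))
mean_act = np.array([activations(x) for x in clean]).mean(axis=0)
target = int(np.argsort(mean_act)[len(mean_act) // 2])

# (B) Trigger generation (toy stand-in for the autoencoder-powered
# generator): gradient ascent on the target neuron's pre-activation,
# with the input clipped to a valid range.
trigger = np.zeros(8)
for _ in range(200):
    trigger = np.clip(trigger + 0.1 * W[target], -1.0, 1.0)

# The crafted trigger drives the selected neuron well above its
# clean-data average activation, which is what lets a trojaned
# input flip the Student model's prediction.
print(activations(trigger)[target], mean_act[target])
```

For a linear layer under a box constraint, this ascent converges toward the sign pattern of the target neuron's weight row, i.e., the input that maximally excites that neuron; the paper's autoencoder instead constrains the trigger to look like natural data so it survives input pre-processing.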
Pages: 1526-1539
Number of pages: 14
Related Papers
50 records in total
  • [1] Aliasing Backdoor Attacks on Pre-trained Models
    Wei, Cheng'an
    Lee, Yeonjoon
    Chen, Kai
    Meng, Guozhu
    Lv, Peizhuo
    PROCEEDINGS OF THE 32ND USENIX SECURITY SYMPOSIUM, 2023, : 2707 - 2724
  • [2] BadEncoder: Backdoor Attacks to Pre-trained Encoders in Self-Supervised Learning
    Jia, Jinyuan
    Liu, Yupei
    Gong, Neil Zhenqiang
    43RD IEEE SYMPOSIUM ON SECURITY AND PRIVACY (SP 2022), 2022, : 2043 - 2059
  • [3] Backdoor Attacks on Pre-trained Models by Layerwise Weight Poisoning
    Li, Linyang
    Song, Demin
    Li, Xiaonan
    Zeng, Jiehang
    Ma, Ruotian
    Qiu, Xipeng
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 3023 - 3032
  • [4] Backdoor Pre-trained Models Can Transfer to All
    Shen, Lujia
    Ji, Shouling
    Zhang, Xuhong
    Li, Jinfeng
    Chen, Jing
    Shi, Jie
    Fang, Chengfang
    Yin, Jianwei
    Wang, Ting
    CCS '21: PROCEEDINGS OF THE 2021 ACM SIGSAC CONFERENCE ON COMPUTER AND COMMUNICATIONS SECURITY, 2021, : 3141 - 3158
  • [5] Transfer learning with pre-trained conditional generative models
    Yamaguchi, Shin'ya
    Kanai, Sekitoshi
    Kumagai, Atsutoshi
    Chijiwa, Daiki
    Kashima, Hisashi
    MACHINE LEARNING, 2025, 114 (4)
  • [6] Towards Inadequately Pre-trained Models in Transfer Learning
    Deng, Andong
    Li, Xingjian
    Hu, Di
    Wang, Tianyang
    Xiong, Haoyi
    Xu, Cheng-Zhong
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 19340 - 19351
  • [7] CBAs: Character-level Backdoor Attacks against Chinese Pre-trained Language Models
    He, Xinyu
    Hao, Fengrui
    Gu, Tianlong
    Chang, Liang
    ACM TRANSACTIONS ON PRIVACY AND SECURITY, 2024, 27 (03)
  • [8] Multi-target Backdoor Attacks for Code Pre-trained Models
    Li, Yanzhou
    Liu, Shangqing
    Chen, Kangjie
    Xie, Xiaofei
    Zhang, Tianwei
    Liu, Yang
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 1, 2023, : 7236 - 7254
  • [9] Defending Pre-trained Language Models as Few-shot Learners against Backdoor Attacks
    Xi, Zhaohan
    Du, Tianyu
    Li, Changjiang
    Pang, Ren
    Ji, Shouling
    Chen, Jinghui
    Ma, Fenglong
    Wang, Ting
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [10] Meta Distant Transfer Learning for Pre-trained Language Models
    Wang, Chengyu
    Pan, Haojie
    Qiu, Minghui
    Yang, Fei
    Huang, Jun
    Zhang, Yin
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 9742 - 9752