BadEncoder: Backdoor Attacks to Pre-trained Encoders in Self-Supervised Learning

Times Cited: 0
Authors
Jia, Jinyuan [1 ]
Liu, Yupei [1 ]
Gong, Neil Zhenqiang [1 ]
Affiliations
[1] Duke Univ, Durham, NC 27706 USA
DOI: 10.1109/SP46214.2022.00021
CLC Classification: TP [Automation technology, computer technology]
Discipline Code: 0812
Abstract
Self-supervised learning in computer vision aims to pre-train an image encoder using a large amount of unlabeled images or (image, text) pairs. The pre-trained image encoder can then be used as a feature extractor to build downstream classifiers for many downstream tasks with a small amount of or no labeled training data. In this work, we propose BadEncoder, the first backdoor attack to self-supervised learning. In particular, our BadEncoder injects backdoors into a pre-trained image encoder such that the downstream classifiers built based on the backdoored image encoder for different downstream tasks simultaneously inherit the backdoor behavior. We formulate our BadEncoder as an optimization problem and propose a gradient-descent-based method to solve it, which produces a backdoored image encoder from a clean one. Our extensive empirical evaluation results on multiple datasets show that our BadEncoder achieves high attack success rates while preserving the accuracy of the downstream classifiers. We also show the effectiveness of BadEncoder using two publicly available, real-world image encoders, i.e., Google's image encoder pre-trained on ImageNet and OpenAI's Contrastive Language-Image Pre-training (CLIP) image encoder pre-trained on 400 million (image, text) pairs collected from the Internet. Moreover, we consider defenses including Neural Cleanse and MNTD (empirical defenses) as well as PatchGuard (a provable defense). Our results show that these defenses are insufficient to defend against BadEncoder, highlighting the need for new defenses against our BadEncoder. Our code is publicly available at: https://github.com/jjy1994/BadEncoder.
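The abstract describes turning a clean pre-trained encoder into a backdoored one by solving an optimization problem with gradient descent. The following is a minimal, hedged sketch of what such a fine-tuning loop could look like in PyTorch; the cosine-similarity losses, the weights lambda1/lambda2, and the helpers stamp_trigger and shadow_loader are illustrative assumptions rather than the paper's exact formulation (see the paper and its repository for the actual method).

# Hedged sketch of a BadEncoder-style fine-tuning loop (PyTorch).
# Assumptions (not taken verbatim from the paper): cosine-similarity losses,
# a single (trigger, reference input) pair, and hypothetical helpers
# stamp_trigger and shadow_loader supplied by the caller.
import copy
import torch
import torch.nn.functional as F

def neg_cos(a, b):
    # Negative mean cosine similarity; minimizing it aligns two feature batches.
    return -F.cosine_similarity(a, b, dim=-1).mean()

def craft_backdoored_encoder(clean_encoder, shadow_loader, stamp_trigger,
                             reference_img, lambda1=1.0, lambda2=1.0,
                             epochs=10, lr=1e-3, device="cpu"):
    clean_encoder = clean_encoder.to(device).eval()
    bad_encoder = copy.deepcopy(clean_encoder).to(device).train()  # start from the clean encoder
    opt = torch.optim.Adam(bad_encoder.parameters(), lr=lr)
    ref = reference_img.unsqueeze(0).to(device)                    # attacker-chosen reference input

    for _ in range(epochs):
        for x in shadow_loader:                                     # unlabeled shadow images
            x = x.to(device)
            x_trig = stamp_trigger(x)                               # hypothetical helper: embed the trigger patch

            with torch.no_grad():                                   # the clean encoder stays frozen
                f_clean = clean_encoder(x)
                f_ref_clean = clean_encoder(ref)

            f_bad = bad_encoder(x)
            f_bad_trig = bad_encoder(x_trig)
            f_bad_ref = bad_encoder(ref)

            # Effectiveness: triggered inputs should map onto the reference input's feature,
            # so downstream classifiers assign them the attacker-chosen class.
            l_eff = neg_cos(f_bad_trig, f_bad_ref.expand_as(f_bad_trig))
            # Keep the backdoored encoder's view of the reference input close to the clean one.
            l_ref = neg_cos(f_bad_ref, f_ref_clean)
            # Utility: clean inputs should keep roughly their original features,
            # so downstream accuracy is preserved.
            l_util = neg_cos(f_bad, f_clean)

            loss = l_eff + lambda1 * l_ref + lambda2 * l_util
            opt.zero_grad()
            loss.backward()
            opt.step()
    return bad_encoder

The design point this sketch tries to convey is that the attacker only needs unlabeled shadow data and the clean encoder: the effectiveness terms plant the trigger behavior in feature space, while the utility term keeps clean-input features (and thus downstream accuracy) largely intact.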
Pages: 2043 - 2059
Page count: 17
Related Papers (50 in total)
  • [1] GhostEncoder: Stealthy backdoor attacks with dynamic triggers to pre-trained encoders in self-supervised learning
    Wang, Qiannan
    Yin, Changchun
    Fang, Liming
    Liu, Zhe
    Wang, Run
    Lin, Chenhao
    [J]. COMPUTERS & SECURITY, 2024, 142
  • [2] Pre-trained Encoders in Self-Supervised Learning Improve Secure and Privacy-preserving Supervised Learning
    Liu, Hongbin
    Qu, Wenjie
    Jia, Jinyuan
    Gong, Neil Zhenqiang
    [J]. PROCEEDINGS 45TH IEEE SYMPOSIUM ON SECURITY AND PRIVACY WORKSHOPS, SPW 2024, 2024, : 144 - 156
  • [3] Backdoor Attacks on Self-Supervised Learning
    Saha, Aniruddha
    Tejankar, Ajinkya
    Koohpayegani, Soroush Abbasi
    Pirsiavash, Hamed
    [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022, : 13327 - 13336
  • [4] Aliasing Backdoor Attacks on Pre-trained Models
    Wei, Cheng'an
    Lee, Yeonjoon
    Chen, Kai
    Meng, Guozhu
    Lv, Peizhuo
    [J]. PROCEEDINGS OF THE 32ND USENIX SECURITY SYMPOSIUM, 2023, : 2707 - 2724
  • [5] Backdoor Attacks Against Transfer Learning With Pre-Trained Deep Learning Models
    Wang, Shuo
    Nepal, Surya
    Rudolph, Carsten
    Grobler, Marthie
    Chen, Shangyu
    Chen, Tianle
    [J]. IEEE TRANSACTIONS ON SERVICES COMPUTING, 2022, 15 (03) : 1526 - 1539
  • [7] Enhancing Pre-trained Language Models by Self-supervised Learning for Story Cloze Test
    Xie, Yuqiang
    Hu, Yue
    Xing, Luxi
    Wang, Chunhui
    Hu, Yong
    Wei, Xiangpeng
    Sun, Yajing
    [J]. KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT (KSEM 2020), PT I, 2020, 12274 : 271 - 279
  • [8] Self-supervised Learning Based on a Pre-trained Method for the Subtype Classification of Spinal Tumors
    Jiao, Menglei
    Liu, Hong
    Yang, Zekang
    Tian, Shuai
    Ouyang, Hanqiang
    Li, Yuan
    Yuan, Yuan
    Liu, Jianfang
    Wang, Chunjie
    Lang, Ning
    Jiang, Liang
    Yuan, Huishu
    Qian, Yueliang
    Wang, Xiangdong
    [J]. COMPUTATIONAL MATHEMATICS MODELING IN CANCER ANALYSIS, CMMCA 2022, 2022, 13574 : 58 - 67
  • [9] SPIQ: A Self-Supervised Pre-Trained Model for Image Quality Assessment
    Chen, Pengfei
    Li, Leida
    Wu, Qingbo
    Wu, Jinjian
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2022, 29 : 513 - 517
  • [10] Backdoor Attacks on Pre-trained Models by Layerwise Weight Poisoning
    Li, Linyang
    Song, Demin
    Li, Xiaonan
    Zeng, Jiehang
    Ma, Ruotian
    Qiu, Xipeng
    [J]. 2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 3023 - 3032