Black-box Detection of Backdoor Attacks with Limited Information and Data

Cited by: 27

Authors:

Dong, Yinpeng [1,2,4]

Yang, Xiao [1,2]

Deng, Zhijie [1,2]

Pang, Tianyu [1,2]

Xiao, Zihao [4]

Su, Hang [1,2,3]

Zhu, Jun [1,2,3,4]

Affiliations:

[1] Tsinghua Bosch Joint ML Ctr, Inst AI, Dept Comp Sci & Tech, Beijing, Peoples R China

[2] Tsinghua Univ, THBI Lab, BNRist Ctr, Beijing 100084, Peoples R China

[3] Pazhou Lab, Guangzhou 510330, Peoples R China

[4] Real AI, Beijing, Peoples R China

Source:

2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021) | 2021

DOI:

10.1109/ICCV48922.2021.01617

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Although deep neural networks (DNNs) have made rapid progress in recent years, they are vulnerable in adversarial environments. A malicious backdoor could be embedded in a model by poisoning the training dataset, whose intention is to make the infected model give wrong predictions during inference when the specific trigger appears. To mitigate the potential threats of backdoor attacks, various backdoor detection and defense methods have been proposed. However, the existing techniques usually require the poisoned training data or access to the white-box model, which is commonly unavailable in practice. In this paper, we propose a black-box backdoor detection (B3D) method to identify backdoor attacks with only query access to the model. We introduce a gradient-free optimization algorithm to reverse-engineer the potential trigger for each class, which helps to reveal the existence of backdoor attacks. In addition to backdoor detection, we also propose a simple strategy for reliable predictions using the identified backdoored models. Extensive experiments on hundreds of DNN models trained on several datasets corroborate the effectiveness of our method under the black-box setting against various backdoor attacks.
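
Illustrative note: the abstract refers to reverse-engineering a trigger by gradient-free optimization with query access only. The sketch below shows the general idea on a toy problem and is not the paper's implementation. It assumes an NES-style antithetic estimator as a stand-in for the unspecified algorithm, an additive perturbation `delta` in place of a real trigger (no mask or size penalty), and a hypothetical linear softmax classifier (`make_toy_model`) as the queried model. All function names and hyperparameters are invented for illustration.

```python
import numpy as np

def nes_gradient(loss_fn, delta, rng, sigma=0.1, n_pairs=20):
    """Estimate the gradient of loss_fn at delta from loss values only."""
    grad = np.zeros_like(delta)
    for _ in range(n_pairs):
        eps = rng.standard_normal(delta.shape)
        # Antithetic pair: two model queries per sampled direction.
        diff = loss_fn(delta + sigma * eps) - loss_fn(delta - sigma * eps)
        grad += diff / (2.0 * sigma) * eps
    return grad / n_pairs

def reverse_engineer_trigger(query_probs, clean_x, target,
                             steps=300, lr=0.02, seed=0):
    """Search for a perturbation that pushes clean inputs toward `target`."""
    rng = np.random.default_rng(seed)

    def loss_fn(delta):
        probs = query_probs(clean_x + delta)  # query access only, no gradients
        return float(-np.log(probs[:, target] + 1e-12).mean())

    delta = np.zeros(clean_x.shape[1])
    history = [loss_fn(delta)]
    for _ in range(steps):
        delta = delta - lr * nes_gradient(loss_fn, delta, rng)
        history.append(loss_fn(delta))
    return delta, history

def make_toy_model(dim=8, n_classes=3, seed=1):
    """Hypothetical stand-in for a deployed model: a fixed linear softmax."""
    W = np.random.default_rng(seed).standard_normal((n_classes, dim))

    def query_probs(x):
        logits = x @ W.T
        logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
        p = np.exp(logits)
        return p / p.sum(axis=1, keepdims=True)

    return query_probs

query_probs = make_toy_model()
clean_x = np.random.default_rng(2).standard_normal((32, 8))
delta, history = reverse_engineer_trigger(query_probs, clean_x, target=0)
```

In a detection setting, one would presumably run such a search once per class and compare the recovered perturbations, with an unusually small one hinting at a backdoor target; that comparison step is omitted here.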

Pages: 16462-16471

Number of pages: 10
