Improving query efficiency of black-box attacks via the preference of models

Cited: 1
Authors
Yang, Xiangyuan [1 ]
Lin, Jie [1 ]
Zhang, Hanlin [2 ]
Zhao, Peng [1 ]
Affiliations
[1] Xi'an Jiaotong Univ, Sch Comp Sci & Technol, Xian, Peoples R China
[2] Qingdao Univ, Qingdao, Peoples R China
Keywords
Black-box query attack; Gradient-aligned attack; Preference property; Gradient preference; ROBUSTNESS;
D O I
10.1016/j.ins.2024.121013
CLC Classification
TP [Automation Technology, Computer Technology];
Discipline Code
0812
Abstract
Black-box query attacks are effective at compromising deep-learning models using only the model's output. These attacks typically face challenges with low attack success rates (ASRs) when limited to fewer than ten queries per example. Recent approaches have improved ASRs due to the transferability of initial perturbations, yet they still suffer from inefficient querying. Our study introduces the Gradient-Aligned Attack (GAA) to enhance ASRs with minimal perturbation by focusing on the model's preference. We define a preference property where the generated adversarial example prefers to be misclassified as the wrong category with a high initial confidence. This property is further elucidated by the gradient preference, suggesting a positive correlation between the magnitude of a coefficient in a partial derivative and the norm of the derivative itself. Utilizing this, we devise the gradient-aligned CE (GACE) loss to precisely estimate gradients by aligning these coefficients between the surrogate and victim models, with coefficients assessed by the victim model's outputs. GAA, based on the GACE loss, also aims to achieve the smallest perturbation. Our tests on ImageNet, CIFAR10, and Imagga API show that GAA can increase ASRs by 25.7% and 40.3% for untargeted and targeted attacks respectively, while only needing minimally disruptive perturbations. Furthermore, the GACE loss reduces the number of necessary queries by up to 2.5x and enhances the transferability of advanced attacks by up to 14.2%, especially when using an ensemble surrogate model. Code is available at https://github.com/HaloMoto/GradientAlignedAttack.
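The abstract's core idea can be loosely illustrated: for plain cross-entropy, the gradient of the loss with respect to the logits is (p - y), so the input gradient is a per-class weighted sum of the surrogate's logit gradients. The sketch below (an assumption based only on the abstract, not the paper's exact GACE formulas; see the DOI above for the real formulation) takes those per-class coefficients from the queried victim model's output probabilities instead of the surrogate's, so the estimated gradient aligns with the victim's preference.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D logit vector."""
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())
    return e / e.sum()

def gace_coefficients(victim_probs, target_class, targeted=False):
    """Per-class coefficients for a gradient-aligned CE loss (illustrative).

    Standard CE on the surrogate yields gradient coefficients
    (p_surrogate - y); here the coefficients come from the *victim*
    model's queried probabilities, (p_victim - y), so the estimated
    gradient direction reflects the victim's preference.
    """
    victim_probs = np.asarray(victim_probs, dtype=float)
    y = np.zeros_like(victim_probs)
    y[target_class] = 1.0
    coeff = victim_probs - y
    # Targeted attacks descend toward the target class; untargeted
    # attacks ascend away from the true class.
    return -coeff if targeted else coeff

def estimated_input_gradient(coeff, surrogate_logit_jacobian):
    """Combine victim-derived coefficients with the surrogate's
    per-class logit gradients: g = sum_k coeff[k] * dz_k/dx."""
    return coeff @ surrogate_logit_jacobian

# Toy example: 3 classes, 4 input dimensions.
victim_probs = softmax([2.0, 0.5, -1.0])        # queried victim output
jac = np.arange(12, dtype=float).reshape(3, 4)  # surrogate dz_k/dx rows
g = estimated_input_gradient(gace_coefficients(victim_probs, 0), jac)
```

The coefficients sum to zero (both `victim_probs` and the one-hot vector sum to one), so the combined gradient is a contrast between the preferred wrong classes and the label class, matching the preference property the abstract describes.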
Pages: 21