Improving query efficiency of black-box attacks via the preference of models

Cited: 1
Authors
Yang, Xiangyuan [1 ]
Lin, Jie [1 ]
Zhang, Hanlin [2 ]
Zhao, Peng [1 ]
Affiliations
[1] Xi'an Jiaotong Univ, Sch Comp Sci & Technol, Xi'an, Peoples R China
[2] Qingdao Univ, Qingdao, Peoples R China
Keywords
Black-box query attack; Gradient-aligned attack; Preference property; Gradient preference; Robustness
DOI
10.1016/j.ins.2024.121013
CLC number
TP [Automation technology, computer technology]
Discipline code
0812
Abstract
Black-box query attacks are effective at compromising deep-learning models using only the model's output. However, they typically suffer from low attack success rates (ASRs) when limited to fewer than ten queries per example. Recent approaches improve ASRs through the transferability of initial perturbations, yet their querying remains inefficient. Our study introduces the Gradient-Aligned Attack (GAA), which enhances ASRs with minimal perturbation by exploiting the model's preference. We define a preference property: a generated adversarial example prefers to be misclassified into the wrong category on which the victim model places high initial confidence. This property is further explained by the gradient preference, which suggests a positive correlation between the magnitude of a coefficient in a partial derivative and the norm of the derivative itself. Building on this, we devise the gradient-aligned cross-entropy (GACE) loss, which estimates gradients precisely by aligning these coefficients between the surrogate and victim models, with the coefficients computed from the victim model's outputs. GAA, built on the GACE loss, also seeks the smallest effective perturbation. Experiments on ImageNet, CIFAR10, and the Imagga API show that GAA increases ASRs by 25.7% and 40.3% for untargeted and targeted attacks, respectively, while requiring only minimally disruptive perturbations. Furthermore, the GACE loss reduces the number of required queries by up to 2.5x and improves the transferability of advanced attacks by up to 14.2%, especially when an ensemble surrogate model is used. Code is available at https://github.com/HaloMoto/GradientAlignedAttack.
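The abstract describes weighting the surrogate model's cross-entropy terms by coefficients taken from the victim model's outputs, so that minimizing the loss drives the adversarial example toward the wrong class the victim already prefers. The sketch below is only an illustration of that idea, not the paper's implementation: the function name `gace_loss` and the exact coefficient scheme (victim probabilities over the wrong classes, renormalized) are assumptions.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D logit vector."""
    e = np.exp(z - z.max())
    return e / e.sum()

def gace_loss(surrogate_logits, victim_probs, true_label):
    """Illustrative gradient-aligned CE loss (assumed scheme, not the
    paper's exact formulation): the surrogate's per-class log-probability
    terms are weighted by coefficients derived from the victim model's
    output probabilities, so minimizing the loss pushes probability mass
    toward the wrong class the victim is most confident about while
    suppressing the true class."""
    p = softmax(np.asarray(surrogate_logits, dtype=float))
    # Coefficients: victim's confidence in each *wrong* class,
    # renormalized so they sum to one over the wrong classes.
    coeff = np.asarray(victim_probs, dtype=float).copy()
    coeff[true_label] = 0.0
    coeff /= coeff.sum()
    # Lower loss = lower true-class probability under the surrogate and
    # higher probability on the victim's preferred wrong classes.
    eps = 1e-12
    return float(np.log(p[true_label] + eps) - np.dot(coeff, np.log(p + eps)))
```

With the true label fixed, a surrogate whose prediction agrees with the victim's preferred wrong class yields a lower loss than one favoring a different wrong class, which is the alignment behavior the preference property suggests.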
Pages: 21
Related papers (50 total)
  • [41] SoK: Pitfalls in Evaluating Black-Box Attacks
    Suya, Fnu
    Suri, Anshuman
    Zhang, Tingwei
    Hong, Jingtao
    Tian, Yuan
    Evans, David
    IEEE CONFERENCE ON SAFE AND TRUSTWORTHY MACHINE LEARNING, SATML 2024, 2024: 387-407
  • [42] PRADA: Practical Black-box Adversarial Attacks against Neural Ranking Models
    Wu, Chen
    Zhang, Ruqing
    Guo, Jiafeng
    De Rijke, Maarten
    Fan, Yixing
    Cheng, Xueqi
    ACM TRANSACTIONS ON INFORMATION SYSTEMS, 2023, 41 (04)
  • [43] Beating White-Box Defenses with Black-Box Attacks
    Kumova, Vera
    Pilat, Martin
    2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,
  • [44] Efficient Black-Box Adversarial Attacks for Deep Driving Maneuver Classification Models
    Sarker, Ankur
    Shen, Haiying
    Sen, Tanmoy
    Mendelson, Quincy
    2021 IEEE 18TH INTERNATIONAL CONFERENCE ON MOBILE AD HOC AND SMART SYSTEMS (MASS 2021), 2021: 536-544
  • [45] Black-box attacks on face recognition via affine-invariant training
    Sun, Bowen
    Su, Hang
    Zheng, Shibao
    NEURAL COMPUTING & APPLICATIONS, 2024, 36 (15): 8549-8564
  • [47] Black-Box Attacks against Signed Graph Analysis via Balance Poisoning
    Zhou, Jialong
    Lai, Yuni
    Ren, Jian
    Zhou, Kai
    2024 INTERNATIONAL CONFERENCE ON COMPUTING, NETWORKING AND COMMUNICATIONS, ICNC, 2024: 530-535
  • [48] Black-Box Adversarial Sample Attack for Query-Less Text Classification Models
    Luo, Senlin
    Cheng, Yao
    Wan, Yunwei
    Pan, Limin
    Li, Xinshuai
    Beijing Ligong Daxue Xuebao/Transaction of Beijing Institute of Technology, 2024, 44 (12): 1277-1286
  • [49] Towards Query-efficient Black-box Adversarial Attack on Text Classification Models
    Yadollahi, Mohammad Mehdi
    Lashkari, Arash Habibi
    Ghorbani, Ali A.
    2021 18TH INTERNATIONAL CONFERENCE ON PRIVACY, SECURITY AND TRUST (PST), 2021,
  • [50] Query-efficient black-box ensemble attack via dynamic surrogate weighting
    Hu, Cong
    He, Zhichao
    Wu, Xiaojun
    PATTERN RECOGNITION, 2025, 161