Improving query efficiency of black-box attacks via the preference of models

Cited: 1
Authors
Yang, Xiangyuan [1 ]
Lin, Jie [1 ]
Zhang, Hanlin [2 ]
Zhao, Peng [1 ]
Affiliations
[1] Xi'an Jiaotong Univ, Sch Comp Sci & Technol, Xian, Peoples R China
[2] Qingdao Univ, Qingdao, Peoples R China
Keywords
Black-box query attack; Gradient-aligned attack; Preference property; Gradient preference; ROBUSTNESS;
D O I
10.1016/j.ins.2024.121013
CLC Classification
TP [Automation Technology, Computer Technology];
Discipline Code
0812
Abstract
Black-box query attacks are effective at compromising deep-learning models using only the model's output. These attacks typically face challenges with low attack success rates (ASRs) when limited to fewer than ten queries per example. Recent approaches have improved ASRs due to the transferability of initial perturbations, yet they still suffer from inefficient querying. Our study introduces the Gradient-Aligned Attack (GAA) to enhance ASRs with minimal perturbation by focusing on the model's preference. We define a preference property where the generated adversarial example prefers to be misclassified as the wrong category with a high initial confidence. This property is further elucidated by the gradient preference, suggesting a positive correlation between the magnitude of a coefficient in a partial derivative and the norm of the derivative itself. Utilizing this, we devise the gradient-aligned CE (GACE) loss to precisely estimate gradients by aligning these coefficients between the surrogate and victim models, with coefficients assessed by the victim model's outputs. GAA, based on the GACE loss, also aims to achieve the smallest perturbation. Our tests on ImageNet, CIFAR10, and Imagga API show that GAA can increase ASRs by 25.7% and 40.3% for untargeted and targeted attacks respectively, while only needing minimally disruptive perturbations. Furthermore, the GACE loss reduces the number of necessary queries by up to 2.5x and enhances the transferability of advanced attacks by up to 14.2%, especially when using an ensemble surrogate model. Code is available at https://github.com/HaloMoto/GradientAlignedAttack.
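The abstract's core idea can be loosely illustrated: for plain cross-entropy, the gradient of the loss with respect to the logits is (p - y), so the input gradient is a per-class weighted sum of the surrogate's logit gradients. The sketch below (an assumption based only on the abstract, not the paper's exact GACE formulas; see the DOI above for the real formulation) takes those per-class coefficients from the queried victim model's output probabilities instead of the surrogate's, so the estimated gradient aligns with the victim's preference.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D logit vector."""
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())
    return e / e.sum()

def gace_coefficients(victim_probs, target_class, targeted=False):
    """Per-class coefficients for a gradient-aligned CE loss (illustrative).

    Standard CE on the surrogate yields gradient coefficients
    (p_surrogate - y); here the coefficients come from the *victim*
    model's queried probabilities, (p_victim - y), so the estimated
    gradient direction reflects the victim's preference.
    """
    victim_probs = np.asarray(victim_probs, dtype=float)
    y = np.zeros_like(victim_probs)
    y[target_class] = 1.0
    coeff = victim_probs - y
    # Targeted attacks descend toward the target class; untargeted
    # attacks ascend away from the true class.
    return -coeff if targeted else coeff

def estimated_input_gradient(coeff, surrogate_logit_jacobian):
    """Combine victim-derived coefficients with the surrogate's
    per-class logit gradients: g = sum_k coeff[k] * dz_k/dx."""
    return coeff @ surrogate_logit_jacobian

# Toy example: 3 classes, 4 input dimensions.
victim_probs = softmax([2.0, 0.5, -1.0])        # queried victim output
jac = np.arange(12, dtype=float).reshape(3, 4)  # surrogate dz_k/dx rows
g = estimated_input_gradient(gace_coefficients(victim_probs, 0), jac)
```

The coefficients sum to zero (both `victim_probs` and the one-hot vector sum to one), so the combined gradient is a contrast between the preferred wrong classes and the label class, matching the preference property the abstract describes.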
Pages: 21