PAT: Geometry-Aware Hard-Label Black-Box Adversarial Attacks on Text

Cited by: 1
Authors
Ye, Muchao [1 ]
Chen, Jinghui [1 ]
Miao, Chenglin [2 ]
Liu, Han [3 ]
Wang, Ting [1 ]
Ma, Fenglong [1 ]
Affiliations
[1] Penn State Univ, University Pk, PA 16802 USA
[2] Iowa State Univ, Ames, IA USA
[3] Dalian Univ Technol, Dalian, Liaoning, Peoples R China
Funding
US National Science Foundation
Keywords
hard-label adversarial attack; robustness of language model
DOI
10.1145/3580305.3599461
Chinese Library Classification (CLC)
TP [automation technology; computer technology]
Discipline code
0812
Abstract
Despite a plethora of prior explorations, conducting text adversarial attacks in practical settings remains challenging under the following constraints: black box - the inner structure of the victim model is unknown; hard label - the attacker only has access to the top-1 prediction result; and semantic preservation - the perturbation must preserve the original semantics. In this paper, we present PAT, a novel adversarial attack method that operates under all of these constraints. Specifically, PAT explicitly models the adversarial and non-adversarial prototypes and uses them to estimate semantic changes when selecting replacements in the hard-label black-box setting, thereby generating high-quality adversarial samples. In each iteration, guided by this estimation, PAT finds original words that can be restored and selects better candidate words for the perturbed positions in a geometry-aware manner, which maximally improves the perturbation construction while minimally affecting the original semantics. Extensive evaluation on benchmark datasets and state-of-the-art models shows that PAT outperforms existing text adversarial attacks in both attack effectiveness and semantic preservation. Moreover, we validate the efficacy of PAT against industry-leading natural language processing platforms in real-world settings.
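The iterative scheme the abstract describes - alternately restoring original words where the text stays adversarial and swapping in candidates that better preserve semantics, all under a top-1-label-only oracle - can be sketched generically. The code below is an illustrative minimal sketch of such a restore-and-replace loop, not the authors' PAT implementation: the function name, the crude seeding strategy, the toy keyword "victim", and the Jaccard similarity stand-in are all assumptions for demonstration.

```python
def hard_label_attack(text, victim_top1, candidates, similarity, max_iters=10):
    """Generic hard-label word-substitution loop (illustrative, not PAT):
    from an adversarial seed, alternately (a) restore original words
    wherever the text stays adversarial and (b) swap in candidate words
    that raise similarity to the original sentence."""
    words = text.split()
    orig_label = victim_top1(text)

    # Crude seed: replace every word that has substitution candidates.
    adv = [candidates.get(w, [w])[0] for w in words]
    if victim_top1(" ".join(adv)) == orig_label:
        return None  # no adversarial seed found

    for _ in range(max_iters):
        improved = False
        for i, w in enumerate(words):
            if adv[i] == w:
                continue
            # (a) Try restoring the original word at position i.
            trial = adv[:i] + [w] + adv[i + 1:]
            if victim_top1(" ".join(trial)) != orig_label:
                adv = trial
                improved = True
                continue
            # (b) Try a candidate that keeps the example adversarial
            #     while increasing similarity to the original text.
            for c in candidates.get(w, []):
                trial = adv[:i] + [c] + adv[i + 1:]
                if (victim_top1(" ".join(trial)) != orig_label
                        and similarity(" ".join(trial), text)
                        > similarity(" ".join(adv), text)):
                    adv = trial
                    improved = True
        if not improved:
            break
    return " ".join(adv)


# Toy demo: a keyword "classifier" that only exposes its top-1 label.
def victim(t):
    return 1 if ("good" in t or "great" in t) else 0

def jaccard(a, b):
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / len(sa | sb)

adv = hard_label_attack("the movie was good", victim,
                        {"good": ["fine", "great"]}, jaccard)
# adv == "the movie was fine": the label flips while only one word changes.
```

PAT additionally learns adversarial and non-adversarial prototypes to guide candidate selection geometrically; the hand-written similarity above merely stands in for that learned estimate.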
Pages: 3093-3104
Number of pages: 12
Related papers
50 records in total
  • [1] HyGloadAttack: Hard-label black-box textual adversarial attacks via hybrid optimization
    Liu, Zhaorong
    Xiong, Xi
    Li, Yuanyuan
    Yu, Yan
    Lu, Jiazhong
    Zhang, Shuai
    Xiong, Fei
    NEURAL NETWORKS, 2024, 178
  • [3] Hard-label Black-box Universal Adversarial Patch Attack
    Tao, Guanhong
    An, Shengwei
    Cheng, Siyuan
    Shen, Guangyu
    Zhang, Xiangyu
    PROCEEDINGS OF THE 32ND USENIX SECURITY SYMPOSIUM, 2023, : 697 - 714
  • [4] TextHoaxer: Budgeted Hard-Label Adversarial Attacks on Text
    Ye, Muchao
    Miao, Chenglin
    Wang, Ting
    Ma, Fenglong
    THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 3877 - 3884
  • [5] HQA-Attack: Toward High Quality Black-Box Hard-Label Adversarial Attack on Text
    Liu, Han
    Xu, Zhi
    Zhang, Xiaotong
    Zhang, Feng
    Ma, Fenglong
    Chen, Hongyang
    Yu, Hong
    Zhang, Xianchao
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [6] Query-Efficient Hard-Label Black-Box Attacks Using Biased Sampling
    Liu, Sijia
    Sun, Jian
    Li, Jun
    2020 CHINESE AUTOMATION CONGRESS (CAC 2020), 2020, : 3872 - 3877
  • [7] Semantic-Aware Adaptive Binary Search for Hard-Label Black-Box Attack
    Ma, Yiqing
    Lucke, Kyle
    Xian, Min
    Vakanski, Aleksandar
    COMPUTERS, 2024, 13 (08)
  • [8] DFDS: Data-Free Dual Substitutes Hard-Label Black-Box Adversarial Attack
    Jiang, Shuliang
    He, Yusheng
    Zhang, Rui
    Kang, Zi
    Xia, Hui
    KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, PT III, KSEM 2024, 2024, 14886 : 274 - 285
  • [9] Fuzzing-based hard-label black-box attacks against machine learning models
    Qin, Yi
    Yue, Chuan
    COMPUTERS & SECURITY, 2022, 117
  • [10] Efficient text-based evolution algorithm to hard-label adversarial attacks on text
    Peng, Hao
    Wang, Zhe
    Zhao, Dandan
    Wu, Yiming
    Han, Jianming
    Guo, Shixin
    Ji, Shouling
    Zhong, Ming
    JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES, 2023, 35 (05)