PAT: Geometry-Aware Hard-Label Black-Box Adversarial Attacks on Text

Cited by: 1
Authors
Ye, Muchao [1 ]
Chen, Jinghui [1 ]
Miao, Chenglin [2 ]
Liu, Han [3 ]
Wang, Ting [1 ]
Ma, Fenglong [1 ]
Affiliations
[1] Penn State Univ, University Pk, PA 16802 USA
[2] Iowa State Univ, Ames, IA USA
[3] Dalian Univ Technol, Dalian, Liaoning, Peoples R China
Funding
US National Science Foundation
Keywords
hard-label adversarial attack; robustness of language models
DOI
10.1145/3580305.3599461
CLC number
TP [Automation Technology, Computer Technology]
Discipline code
0812
Abstract
Despite a plethora of prior explorations, conducting text adversarial attacks in practical settings is still challenging under the following constraints: black box - the inner structure of the victim model is unknown; hard label - the attacker only has access to the top-1 prediction results; and semantic preservation - the perturbation needs to preserve the original semantics. In this paper, we present PAT, a novel adversarial attack method that operates under all these constraints. Specifically, PAT explicitly models the adversarial and non-adversarial prototypes and incorporates them to measure semantic changes for replacement selection in the hard-label black-box setting to generate high-quality samples. In each iteration, PAT finds original words that can be replaced back and selects better candidate words for perturbed positions in a geometry-aware manner guided by this estimation, which maximally improves the perturbation construction and minimally impacts the original semantics. Extensive evaluation with benchmark datasets and state-of-the-art models shows that PAT outperforms existing text adversarial attacks in terms of both attack effectiveness and semantic preservation. Moreover, we validate the efficacy of PAT against industry-leading natural language processing platforms in real-world settings.
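To make the setting concrete, the sketch below illustrates the generic shape of a hard-label black-box word-substitution attack: the victim exposes only a top-1 label, and candidate replacements are ranked by distance in an embedding space as a crude stand-in for PAT's geometry-aware, prototype-based semantic estimate. All names, embeddings, synonym sets, and the toy victim here are illustrative assumptions, not the authors' actual algorithm or data.

```python
import numpy as np

# Toy embeddings and synonym sets (illustrative stand-ins for real resources).
EMB = {
    "good": np.array([1.0, 0.0]), "great": np.array([0.9, 0.1]),
    "fine": np.array([0.8, 0.3]), "bad": np.array([-1.0, 0.0]),
    "movie": np.array([0.0, 1.0]), "film": np.array([0.1, 0.9]),
}
SYNONYMS = {"good": ["great", "fine"], "movie": ["film"]}

def victim_top1(words):
    """Hard-label black-box victim: returns only a top-1 label (toy rule)."""
    return 1 if "good" in words else 0

def semantic_cost(orig, repl):
    """Geometry-aware proxy: embedding distance between original and candidate."""
    return float(np.linalg.norm(EMB[orig] - EMB[repl]))

def hard_label_attack(words):
    """Greedy loop: flip the top-1 label via low-semantic-cost substitutions."""
    words = list(words)
    original_label = victim_top1(words)
    for i, w in enumerate(words):
        # Try candidates for position i, cheapest semantic change first.
        for cand in sorted(SYNONYMS.get(w, []), key=lambda c: semantic_cost(w, c)):
            trial = words[:i] + [cand] + words[i + 1:]
            if victim_top1(trial) != original_label:  # one query per candidate
                return trial
    return None  # no adversarial example found by this simple search
```

This greedy single-pass search omits PAT's iterative refinement (e.g. replacing original words back once the sample is adversarial); it is meant only to show how replacement selection can be driven by a semantic-distance estimate when nothing but hard labels is available.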
Pages: 3093-3104
Page count: 12
Related papers
50 records
  • [21] Black-box Adversarial Attacks on Video Recognition Models
    Jiang, Linxi
    Ma, Xingjun
    Chen, Shaoxiang
    Bailey, James
    Jiang, Yu-Gang
    PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA (MM'19), 2019, : 864 - 872
  • [22] Black-box adversarial attacks by manipulating image attributes
    Wei, Xingxing
    Guo, Ying
    Li, Bo
    INFORMATION SCIENCES, 2021, 550 : 285 - 296
  • [23] Physical Black-Box Adversarial Attacks Through Transformations
    Jiang, Wenbo
    Li, Hongwei
    Xu, Guowen
    Zhang, Tianwei
    Lu, Rongxing
    IEEE TRANSACTIONS ON BIG DATA, 2023, 9 (03) : 964 - 974
  • [24] Boosting Black-Box Adversarial Attacks with Meta Learning
    Fu, Junjie
    Sun, Jian
    Wang, Gang
    2022 41ST CHINESE CONTROL CONFERENCE (CCC), 2022, : 7308 - 7313
  • [25] A review of black-box adversarial attacks on image classification
    Zhu, Yanfei
    Zhao, Yaochi
    Hu, Zhuhua
    Luo, Tan
    He, Like
    NEUROCOMPUTING, 2024, 610
  • [26] Curls & Whey: Boosting Black-Box Adversarial Attacks
    Shi, Yucheng
    Wang, Siyu
    Han, Yahong
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 6512 - 6520
  • [27] Boundary Defense Against Black-box Adversarial Attacks
    Aithal, Manjushree B.
    Li, Xiaohua
    2022 26TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2022, : 2349 - 2356
  • [28] Black-box Adversarial Attacks with Limited Queries and Information
    Ilyas, Andrew
    Engstrom, Logan
    Athalye, Anish
    Lin, Jessy
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 80, 2018, 80
  • [30] Black-box Universal Adversarial Attack on Text Classifiers
    Zhang, Yu
    Shao, Kun
    Yang, Junan
    Liu, Hui
    2021 2ND ASIA CONFERENCE ON COMPUTERS AND COMMUNICATIONS (ACCC 2021), 2021, : 1 - 5