PAT: Geometry-Aware Hard-Label Black-Box Adversarial Attacks on Text

被引:1
|
作者
Ye, Muchao [1 ]
Chen, Jinghui [1 ]
Miao, Chenglin [2 ]
Liu, Han [3 ]
Wang, Ting [1 ]
Ma, Fenglong [1 ]
机构
[1] Penn State Univ, University Pk, PA 16802 USA
[2] Iowa State Univ, Ames, IA USA
[3] Dalian Univ Technol, Dalian, Liaoning, Peoples R China
基金
美国国家科学基金会;
关键词
hard-label adversarial attack; robustness of language model;
D O I
10.1145/3580305.3599461
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Despite a plethora of prior explorations, conducting text adversarial attacks in practical settings is still challenging with the following constraints: black box - the inner structure of the victim model is unknown; hard label - the attacker only has access to the top-1 prediction results; and semantic preservation - the perturbation needs to preserve the original semantics. In this paper, we present PAT,1 a novel adversarial attack method employed under all these constraints. Specifically, PAT explicitly models the adversarial and non-adversarial prototypes and incorporates them to measure semantic changes for replacement selection in the hard-label black-box setting to generate high-quality samples. In each iteration, PAT finds original words that can be replaced back and selects better candidate words for perturbed positions in a geometry-aware manner guided by this estimation, which maximally improves the perturbation construction and minimally impacts the original semantics. Extensive evaluation with benchmark datasets and state-of-the-art models shows that PAT outperforms existing text adversarial attacks in terms of both attack effectiveness and semantic preservation. Moreover, we validate the efficacy of PAT against industry-leading natural language processing platforms in real-world settings.
引用
收藏
页码:3093 / 3104
页数:12
相关论文
共 50 条
  • [31] White-to-Black: Efficient Distillation of Black-Box Adversarial Attacks
    Gil, Yotam
    Chai, Yoav
    Gorodissky, Or
    Berant, Jonathan
    2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, 2019, : 1373 - 1379
  • [32] Black-box adversarial attacks on XSS attack detection model
    Wang, Qiuhua
    Yang, Hui
    Wu, Guohua
    Choo, Kim-Kwang Raymond
    Zhang, Zheng
    Miao, Gongxun
    Ren, Yizhi
    COMPUTERS & SECURITY, 2022, 113
  • [33] Black-box transferable adversarial attacks based on ensemble advGAN
    Huang S.-N.
    Li Y.-X.
    Mao Y.-H.
    Ban A.-Y.
    Zhang Z.-Y.
    Jilin Daxue Xuebao (Gongxueban)/Journal of Jilin University (Engineering and Technology Edition), 2022, 52 (10): : 2391 - 2398
  • [34] Black-Box Adversarial Attacks against Audio Forensics Models
    Jiang, Yi
    Ye, Dengpan
    SECURITY AND COMMUNICATION NETWORKS, 2022, 2022
  • [35] AutoAttacker: A reinforcement learning approach for black-box adversarial attacks
    Tsingenopoulos, Ilias
    Preuveneers, Davy
    Joosen, Wouter
    2019 4TH IEEE EUROPEAN SYMPOSIUM ON SECURITY AND PRIVACY WORKSHOPS (EUROS&PW), 2019, : 229 - 237
  • [36] Query-based Local Black-box Adversarial Attacks
    Shi, Jing
    Zhang, Xiaolin
    Xu, Enhui
    Wang, Yongping
    Zhang, Wenwen
    International Journal of Network Security, 2023, 25 (06) : 1048 - 1058
  • [37] Simple Black-Box Adversarial Attacks on Deep Neural Networks
    Narodytska, Nina
    Kasiviswanathan, Shiva
    2017 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW), 2017, : 1310 - 1318
  • [38] Heuristic Black-Box Adversarial Attacks on Video Recognition Models
    Wei, Zhipeng
    Chen, Jingjing
    Wei, Xingxing
    Jiang, Linxi
    Chua, Tat-Seng
    Zhou, Fengfeng
    Jiang, Yu-Gang
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 12338 - 12345
  • [39] Adaptive Temporal Grouping for Black-box Adversarial Attacks on Videos
    Wei, Zhipeng
    Chen, Jingjing
    Zhang, Hao
    Jiang, Linxi
    Jiang, Yu-Gang
    PROCEEDINGS OF THE 2022 INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2022, 2022, : 587 - 593
  • [40] LeapAttack: Hard-Label Adversarial Attack on Text via Gradient-Based Optimization
    Ye, Muchao
    Chen, Jinghui
    Miao, Chenglin
    Wang, Ting
    Ma, Fenglong
    PROCEEDINGS OF THE 28TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2022, 2022, : 2307 - 2315