Evaluating and Enhancing the Robustness of Sustainable Neural Relationship Classifiers Using Query-Efficient Black-Box Adversarial Attacks

Cited by: 2

Authors:

Haq, Ijaz Ul [1]

Khan, Zahid Younas [1,2]

Ahmad, Arshad [3]

Hayat, Bashir [4]

Khan, Asif [1]

Lee, Ye-Eun [5]

Kim, Ki-Il [5]

Affiliations:

[1] Beijing Inst Technol, Sch Comp Sci & Technol, Beijing 10081, Peoples R China

[2] Univ Azad Jammu & Kashmir, Dept Comp Sci & Informat Technol, Muzaffarabad 13100, Pakistan

[3] Pak Austria Fachhsch Inst Appl Sci & Technol, Dept IT & Comp Sci, Haripur 22620, Pakistan

[4] Inst Management Sci Peshawar, Peshawar 25100, Pakistan

[5] Chungnam Natl Univ, Dept Comp Sci & Engn, Daejeon 34134, South Korea

Source:

SUSTAINABILITY | 2021 / Vol. 13 / Issue 11

Keywords:

robust; sustainability; adversarial attack; black-box attack; TFIDF; relation extraction; deep neural networks; RELATION EXTRACTION; RECOMMENDATION; REVIEWS; MODEL;

DOI:

10.3390/su13115892

Chinese Library Classification (CLC) number:

X [Environmental Science, Safety Science]

Discipline classification codes:

08; 0830

Abstract:

Neural relation extraction (NRE) models are the backbone of various machine learning tasks, including knowledge base enrichment, information extraction, and document summarization. Despite the vast popularity of these models, their vulnerabilities remain unknown; this is of high concern given their growing use in security-sensitive applications such as question answering and machine translation in the aspects of sustainability. In this study, we demonstrate that NRE models are inherently vulnerable to adversarially crafted text that contains imperceptible modifications of the original but can mislead the target NRE model. Specifically, we propose a novel sustainable term frequency-inverse document frequency (TFIDF) based black-box adversarial attack to evaluate the robustness of state-of-the-art CNN, CGN, LSTM, and BERT-based models on two benchmark RE datasets. Compared with white-box adversarial attacks, black-box attacks impose further constraints on the query budget; thus, efficient black-box attacks remain an open problem. By applying TFIDF to the correctly classified sentences of each class label in the test set, the proposed query-efficient method achieves a reduction of up to 70% in the number of queries to the target model for identifying important text items. Based on these items, we design both character- and word-level perturbations to generate adversarial examples. The proposed attack successfully reduces the accuracy of six representative models from an average F1 score of 80% to below 20%. The generated adversarial examples were evaluated by humans and are considered semantically similar. Moreover, we discuss defense strategies that mitigate such attacks, and the potential countermeasures that could be deployed in order to improve sustainability of the proposed scheme.
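
The abstract describes ranking text items by TFIDF within each class label and then perturbing the highest-ranked items. A minimal Python sketch of that idea follows. It is not the authors' implementation: pooling all sentences of a class into one "document", whitespace tokenization, the helper names (`class_tfidf`, `top_terms`, `swap_inner_chars`, `perturb`), the toy sentences, and the specific adjacent-character swap are all assumptions made for illustration, and no target model is queried here.

```python
import math
from collections import Counter, defaultdict


def class_tfidf(sentences, labels):
    """Score terms per class label.

    Assumption: all sentences sharing a label are pooled into one document,
    so IDF measures how many classes a term appears in.
    """
    class_counts = defaultdict(Counter)
    for sentence, label in zip(sentences, labels):
        class_counts[label].update(sentence.lower().split())

    n_classes = len(class_counts)
    # Document frequency: number of class-documents containing each term.
    doc_freq = Counter()
    for counts in class_counts.values():
        doc_freq.update(counts.keys())

    scores = {}
    for label, counts in class_counts.items():
        total = sum(counts.values())
        scores[label] = {
            term: (count / total) * math.log(n_classes / doc_freq[term])
            for term, count in counts.items()
        }
    return scores


def top_terms(scores, label, k=3):
    """Return the k highest-scoring terms for a label (ties broken alphabetically)."""
    ranked = sorted(scores[label].items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


def swap_inner_chars(word):
    """Hypothetical character-level perturbation: swap two adjacent middle characters."""
    if len(word) < 4:
        return word
    mid = len(word) // 2
    return word[:mid - 1] + word[mid] + word[mid - 1] + word[mid + 1:]


def perturb(sentence, important_terms):
    """Apply the character swap only to tokens listed as important."""
    return " ".join(
        swap_inner_chars(tok) if tok.lower() in important_terms else tok
        for tok in sentence.split()
    )


# Toy data, invented for this example only.
sentences = [
    "the engine caused the fire",
    "the storm caused the flood",
    "the wheel is part of the car",
    "the page is part of the book",
]
labels = ["Cause-Effect", "Cause-Effect", "Component-Whole", "Component-Whole"]

scores = class_tfidf(sentences, labels)
important = set(top_terms(scores, "Cause-Effect", k=1))
adversarial = perturb(sentences[0], important)
```

In this sketch the candidate words are chosen from corpus statistics alone, which is one plausible reading of how a TFIDF ranking could reduce the number of queries sent to the target model; a real attack would still need to query the model to confirm that a perturbed sentence changes the prediction.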


Pages: 25
