Generating Natural Language Adversarial Examples on a Large Scale with Generative Models

Cited by: 5

Authors:

Ren, Yankun [1]

Lin, Jianbin [1]

Tang, Siliang [2]

Zhou, Jun [1]

Yang, Shuang [1]

Qi, Yuan [1]

Ren, Xiang [3]

Affiliations:

[1] Ant Financial Services Group, Hangzhou, People's Republic of China

[2] Zhejiang University, Hangzhou, People's Republic of China

[3] University of Southern California, Los Angeles, CA 90007, USA

Source:

ECAI 2020: 24th European Conference on Artificial Intelligence | 2020 / Volume 325

Keywords:

DOI:

10.3233/FAIA200340

Chinese Library Classification number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Today, text classification models are widely used. However, these classifiers are found to be easily fooled by adversarial examples. Fortunately, standard attacking methods generate adversarial texts in a pair-wise way; that is, an adversarial text can only be created from a real-world text by replacing a few words. In many applications, these texts are limited in number; therefore, their corresponding adversarial examples are often not diverse enough and sometimes hard to read, and thus can be easily detected by humans and cannot create chaos at a large scale. In this paper, we propose an end-to-end solution to efficiently generate adversarial texts from scratch using generative models, which are not restricted to perturbing the given texts. We call it unrestricted adversarial text generation. Specifically, we train a conditional variational autoencoder (VAE) with an additional adversarial loss to guide the generation of adversarial examples. Moreover, to improve the validity of adversarial texts, we utilize discriminators and the training framework of generative adversarial networks (GANs) to make adversarial texts consistent with real data. Experimental results on sentiment analysis demonstrate the scalability and efficiency of our method. It can attack text classification models with a higher success rate than existing methods while providing acceptable quality for human readers.
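The abstract names three training signals (a conditional VAE objective, an additional adversarial loss, and GAN-style discriminators) but does not give their formulas. The following is only a minimal sketch of how such terms could be combined into one generator objective. The function names, the weights `adv_weight` and `gan_weight`, and the exact form of each term are assumptions made for illustration and are not taken from the paper.

```python
import math

def kl_to_standard_normal(mu, logvar):
    """KL(N(mu, diag(exp(logvar))) || N(0, I)), summed over latent dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mu, logvar))

def cross_entropy(probs, label):
    """Negative log-probability that `probs` assigns to `label`."""
    return -math.log(probs[label])

def generator_objective(recon_nll, mu, logvar, victim_probs, wrong_label,
                        disc_real_prob, adv_weight=1.0, gan_weight=1.0):
    """Hypothetical combined loss for one generated text (all terms are assumed forms).

    recon_nll      -- reconstruction negative log-likelihood from the VAE decoder
    mu, logvar     -- parameters of the approximate posterior over the latent code
    victim_probs   -- class probabilities the attacked classifier gives the generated text
    wrong_label    -- the label we want the attacked classifier to predict
    disc_real_prob -- discriminator's probability that the generated text is real
    """
    # Standard VAE term: reconstruction plus KL regularizer.
    vae_loss = recon_nll + kl_to_standard_normal(mu, logvar)
    # Adversarial term: small when the victim classifier predicts the wrong label.
    adv_loss = cross_entropy(victim_probs, wrong_label)
    # GAN generator term: small when the discriminator believes the text is real.
    gan_loss = -math.log(disc_real_prob)
    return vae_loss + adv_weight * adv_loss + gan_weight * gan_loss
```

In this sketch, raising `adv_weight` trades text fidelity for attack success, and raising `gan_weight` pushes generated texts toward the real-data distribution; the paper may balance these terms differently.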


Pages: 2156-2163

Page count: 8

Related records (50 in total)

[1] Generating adversarial examples with collaborative generative models
Xu, Lei
Zhai, Junhai
INTERNATIONAL JOURNAL OF INFORMATION SECURITY, 2024, 23 (02) : 1077 - 1091
[2] Generating adversarial examples with collaborative generative models
Lei Xu
Junhai Zhai
International Journal of Information Security, 2024, 23 : 1077 - 1091
[3] Generating Natural Language Adversarial Examples
Alzantot, Moustafa
Sharma, Yash
Elgohary, Ahmed
Ho, Bo-Jhang
Srivastava, Mani B.
Chang, Kai-Wei
2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018), 2018, : 2890 - 2896
[4] MESDeceiver: Efficiently Generating Natural Language Adversarial Examples
Zhao, Tengfei
Ge, Zhaocheng
Hu, Hanping
Shi, Dingmeng
2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022,
[5] Generating Adversarial Examples With Conditional Generative Adversarial Net
Yu, Ping
Song, Kaitao
Lu, Jianfeng
2018 24TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2018, : 676 - 681
[6] Adversarial examples for generative models
Kos, Jernej
Fischer, Ian
Song, Dawn
2018 IEEE SYMPOSIUM ON SECURITY AND PRIVACY WORKSHOPS (SPW 2018), 2018, : 36 - 42
[7] AdvExpander: Generating Natural Language Adversarial Examples by Expanding Text
Shao, Zhihong
Wu, Zhongqin
Huang, Minlie
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2022, 30 : 1184 - 1196
[8] Visual Adversarial Examples Jailbreak Aligned Large Language Models
Princeton University, United States
Proc. AAAI Conf. Artif. Intell., 19 (21527-21536):
[9] Generating Natural Language Adversarial Examples through Probability Weighted Word Saliency
Ren, Shuhuai
Deng, Yihe
He, Kun
Che, Wanxiang
57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019, : 1085 - 1097
[10] Constructing Unrestricted Adversarial Examples with Generative Models
Song, Yang
Shu, Rui
Kushman, Nate
Ermon, Stefano
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
