Evaluating Neural Model Robustness for Machine Comprehension

Cited by: 0

Authors:

Wu, Winston [1]

Arendt, Dustin [2]

Volkova, Svitlana [3]

Affiliations:

[1] Johns Hopkins Univ, Dept Comp Sci, Ctr Language & Speech Proc, Baltimore, MD 21218 USA

[2] Pacific Northwest Natl Lab, Visual Analyt Grp, Richland, WA USA

[3] Pacific Northwest Natl Lab, Data Sci & Analyt Grp, Richland, WA USA

Source:

16th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2021) | 2021

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

We evaluate neural model robustness to adversarial attacks using different types of linguistic unit perturbations (character and word), and propose a new method for strategic sentence-level perturbations. We experiment with different amounts of perturbations to examine model confidence and misclassification rate, and contrast model performance with different embeddings (BERT and ELMo) on two benchmark datasets (SQuAD and TriviaQA). We demonstrate how to improve model performance during an adversarial attack by using ensembles. Finally, we analyze factors that affect model behavior under adversarial attack, and develop a new model to predict errors during attacks. Our novel findings reveal that (a) models that use ELMo embeddings are more susceptible to adversarial attacks than models that use BERT, (b) unlike word and paraphrase perturbations, character perturbations affect the model the most but are most easily compensated for by adversarial training, (c) word perturbations lead to more high-confidence misclassifications than sentence- and character-level perturbations, (d) the type of question and the model answer length (the longer the answer, the more likely it is to be incorrect) are the most predictive of model errors in an adversarial setting, and (e) conclusions about model behavior are dataset-specific.


Pages: 2470-2481

Page count: 12
