Evaluating Neural Model Robustness for Machine Comprehension

Cited by: 0
Authors
Wu, Winston [1]
Arendt, Dustin [2]
Volkova, Svitlana [3]
Affiliations
[1] Johns Hopkins Univ, Dept Comp Sci, Ctr Language & Speech Proc, Baltimore, MD 21218 USA
[2] Pacific Northwest Natl Lab, Visual Analyt Grp, Richland, WA USA
[3] Pacific Northwest Natl Lab, Data Sci & Analyt Grp, Richland, WA USA
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We evaluate neural model robustness to adversarial attacks using perturbations of different linguistic units (characters and words), and propose a new method for strategic sentence-level perturbations. We experiment with different amounts of perturbation to examine model confidence and misclassification rate, and contrast model performance with different embeddings (BERT and ELMo) on two benchmark datasets, SQuAD and TriviaQA. We demonstrate how to improve model performance during an adversarial attack by using ensembles. Finally, we analyze factors that affect model behavior under adversarial attack, and develop a new model to predict errors during attacks. Our novel findings reveal that (a) models that use ELMo embeddings are more susceptible to adversarial attacks than those that use BERT, (b) compared with word and paraphrase perturbations, character perturbations affect the model the most but are most easily compensated for by adversarial training, (c) word perturbations lead to more high-confidence misclassifications than sentence- and character-level perturbations, (d) the type of question and the model answer length (the longer the answer, the more likely it is to be incorrect) are the most predictive of model errors in an adversarial setting, and (e) conclusions about model behavior are dataset-specific.
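To make the abstract's terms concrete, the following is a minimal sketch of what character-level and word-level perturbations and a misclassification rate could look like. It is not the authors' implementation: the function names, the adjacent-character swap, the word-replacement rule, and the `rate` and `seed` parameters are all assumptions chosen for illustration.

```python
import random


def perturb_characters(text, rate, seed=0):
    """Swap two adjacent inner characters in each word with probability `rate`.

    Hypothetical character-level perturbation; the first and last
    characters of every word are left in place.
    """
    rng = random.Random(seed)
    out = []
    for word in text.split(" "):
        if len(word) > 3 and rng.random() < rate:
            i = rng.randrange(1, len(word) - 2)  # index of the left character to swap
            word = word[:i] + word[i + 1] + word[i] + word[i + 2:]
        out.append(word)
    return " ".join(out)


def perturb_words(text, rate, seed=0):
    """Replace each word, with probability `rate`, by a word drawn from the same text.

    Hypothetical word-level perturbation; a real attack might instead use
    synonyms or nearest neighbours in an embedding space.
    """
    rng = random.Random(seed)
    words = text.split(" ")
    return " ".join(rng.choice(words) if rng.random() < rate else w for w in words)


def misclassification_rate(predictions, gold):
    """Fraction of predictions that differ from the gold answers."""
    if not gold:
        return 0.0
    wrong = sum(p != g for p, g in zip(predictions, gold))
    return wrong / len(gold)


if __name__ == "__main__":
    context = "The model reads the passage and answers the question"
    for rate in (0.0, 0.5, 1.0):
        print(rate, "|", perturb_characters(context, rate))
        print(rate, "|", perturb_words(context, rate))
```

Varying `rate` corresponds to the "different amounts of perturbation" mentioned in the abstract. A reading-comprehension model's answers on the perturbed contexts would then be scored with `misclassification_rate` against the gold answers.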
Pages: 2470-2481
Number of pages: 12