Reproducing a Neural Question Answering Architecture Applied to the SQuAD Benchmark Dataset: Challenges and Lessons Learned

Cited by: 2
Authors
Duer, Alexander [1]
Rauber, Andreas [1]
Filzmoser, Peter [1]
Affiliations
[1] Vienna Univ Technol, Vienna, Austria
DOI
10.1007/978-3-319-76941-7_8
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Subject Classification Code
0812
Abstract
Reproducibility is one of the pillars of scientific research. This study attempts to reproduce the Gated Self-Matching Network, which underlies one of the best-performing models on the SQuAD dataset. We reimplement the neural network model and highlight ambiguities in the original architectural description. We show that uncertainty about just two components of the neural network model, together with the lack of a precise description of the training process, makes it impossible to reproduce the experimental results obtained by the original implementation. Finally, we summarize what we learned from this reproduction process about writing precise neural network architecture descriptions, and provide our implementation as a basis for future exploration.
Pages: 102 - 113
Page count: 12
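
The Gated Self-Matching Network referred to in the abstract is built around a gated attention-based recurrent layer that matches the passage against the question and then against itself. The sketch below (PyTorch) illustrates one such recurrent step as the architecture is commonly described; the tensor shapes, the additive attention scoring form, the class and parameter names, and the hyperparameters are illustrative assumptions, not the authors' reimplementation or the original model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GatedAttentionCell(nn.Module):
    """One gated attention-based recurrent step.

    The memory is the question encoding for the question-passage matching
    layer, or the passage encoding itself for the self-matching layer.
    """

    def __init__(self, input_dim: int, memory_dim: int, hidden_dim: int):
        super().__init__()
        # Additive attention (an assumed scoring form, for illustration only).
        self.proj = nn.Linear(input_dim + memory_dim + hidden_dim, hidden_dim)
        self.score = nn.Linear(hidden_dim, 1, bias=False)
        # Sigmoid gate applied to the concatenated [input; attended context].
        self.gate = nn.Linear(input_dim + memory_dim, input_dim + memory_dim, bias=False)
        self.gru = nn.GRUCell(input_dim + memory_dim, hidden_dim)

    def forward(self, x_t, memory, h_prev):
        # x_t:    (batch, input_dim)           current passage token encoding
        # memory: (batch, mem_len, memory_dim) encodings attended over
        # h_prev: (batch, hidden_dim)          previous recurrent state
        mem_len = memory.size(1)
        features = torch.cat(
            [x_t.unsqueeze(1).expand(-1, mem_len, -1),
             memory,
             h_prev.unsqueeze(1).expand(-1, mem_len, -1)],
            dim=-1)
        alpha = F.softmax(self.score(torch.tanh(self.proj(features))).squeeze(-1), dim=-1)
        context = torch.bmm(alpha.unsqueeze(1), memory).squeeze(1)  # attention-pooled memory

        combined = torch.cat([x_t, context], dim=-1)
        gated = torch.sigmoid(self.gate(combined)) * combined  # element-wise input gate
        return self.gru(gated, h_prev)


# Illustrative usage: roll the cell over a toy passage with a toy question memory.
cell = GatedAttentionCell(input_dim=128, memory_dim=128, hidden_dim=75)
passage = torch.randn(2, 40, 128)   # (batch, passage_len, input_dim)
question = torch.randn(2, 60, 128)  # (batch, question_len, memory_dim)
h = torch.zeros(2, 75)
for t in range(passage.size(1)):
    h = cell(passage[:, t], question, h)
```

The sigmoid gate rescales each dimension of the concatenated input before the GRU update; this gating step is what distinguishes the layer from a plain attention-augmented GRU.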