Exploring Retriever-Reader Approaches in Question-Answering on Scientific Documents

Cited by: 1
Authors
Dieu-Hien Nguyen [1]
Nguyen-Khang Le [1]
Minh Le Nguyen [1]
Affiliations
[1] Japan Adv Inst Sci & Technol, Nomi, Japan
Keywords
Question-answering; Retriever-reader; Long sequences
DOI
10.1007/978-981-19-8234-7_30
CLC number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
As readers of scientific articles often read to answer specific questions, the task of Question-Answering (QA) on academic papers was proposed to evaluate the ability of intelligent systems to answer questions about long scientific documents. Due to the large contexts of the questions, this task poses many challenges to state-of-the-art QA models. This paper explores the retriever-reader approaches widely used in open-domain QA and their impact when adapted to QA on long scientific documents. By treating one scientific article as the corpus for retrieval, we propose a retriever-reader method to extract the answer from the relevant parts of the document and an effective sliding window technique that improves the pipeline by splitting the articles into disjoint text blocks of fixed size. Experiments on QASPER, a dataset for QA on Natural Language Processing papers, show that our method outperforms all state-of-the-art models and establishes a new state of the art on the extractive questions subset with 30.43% F1. The code and processed data are available at https://github.com/lekhang4497/qasper-retriever-reader.
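The block-splitting and retrieval idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (their code is at the repository linked above); the function names `tokenize`, `split_into_blocks`, and `retrieve`, the regex tokenizer, and the TF-IDF-style score are all assumptions chosen for illustration. The sketch treats a single article as the retrieval corpus, cuts it into disjoint fixed-size blocks, and ranks the blocks against a question; a reader model would then extract the answer from the top-ranked blocks.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens (a simplistic, assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def split_into_blocks(tokens, block_size):
    """Cut a token list into disjoint, consecutive blocks of at most block_size tokens."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [tokens[i:i + block_size] for i in range(0, len(tokens), block_size)]


def retrieve(question, blocks, top_k=1):
    """Return indices of the top_k blocks under a simple TF-IDF overlap score.

    This lexical score is a stand-in for whatever retriever a real pipeline uses.
    """
    n_blocks = len(blocks)
    doc_freq = Counter()
    for block in blocks:
        doc_freq.update(set(block))

    question_terms = set(tokenize(question))
    scored = []
    for idx, block in enumerate(blocks):
        term_freq = Counter(block)
        score = sum(
            term_freq[t] * math.log(1 + n_blocks / doc_freq[t])
            for t in question_terms
            if doc_freq[t] > 0
        )
        scored.append((-score, idx))  # negative score so sorting ranks best first

    scored.sort()
    return [idx for _, idx in scored[:top_k]]


# Toy "article" used only to exercise the functions.
article = (
    "We train the model on QASPER. "
    "The reader is a transformer. "
    "Results are reported in F1."
)
tokens = tokenize(article)
blocks = split_into_blocks(tokens, block_size=6)
best = retrieve("What model is trained on QASPER?", blocks, top_k=1)
```

Because the blocks are disjoint, every token of the article appears in exactly one block, so the retriever's corpus is never larger than the article itself; the block size is the main knob that trades retrieval granularity against the context available to the reader.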
Pages: 383-395
Page count: 13