Retrieval Data Augmentation Informed by Downstream Question Answering Performance

Authors
Ferguson, James [1]
Dasigi, Pradeep [2]
Khot, Tushar [2]
Hajishirzi, Hannaneh [1, 2]
Affiliations
[1] Univ Washington, Seattle, WA 98195 USA
[2] Allen Inst AI, Seattle, WA USA
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Training retrieval models to fetch contexts for Question Answering (QA) over large corpora requires labeling relevant passages in those corpora. Since exhaustive manual annotation of all relevant passages is infeasible, prior work uses text-overlap heuristics to find passages that are likely to contain the answer. Such heuristics break down when the task requires deeper reasoning and answers are not extractable spans (e.g., multi-hop or discrete reasoning). We address this issue by identifying relevant passages based on whether they help a trained QA model arrive at the correct answer, and we develop a search process guided by the QA model's loss. Our experiments show that this approach identifies relevant context for unseen data more than 90% of the time on the IIRC dataset, and that it generalizes better to the end QA task than models trained on only the gold retrieval data, on both the IIRC and QASC datasets.
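The idea of a search guided by the QA model's loss can be illustrated with a minimal sketch. The abstract does not specify the search procedure, so the greedy strategy below is an assumption, and every name in it (loss_guided_search, toy_qa_loss, max_passages, min_gain) is hypothetical. The toy loss is only a stand-in for a trained QA model's loss: it ignores the question and answer and simply counts required facts missing from the context.

```python
from typing import Callable, List, Sequence


def loss_guided_search(
    question: str,
    answer: str,
    candidates: Sequence[str],
    qa_loss: Callable[[str, List[str], str], float],
    max_passages: int = 3,
    min_gain: float = 1e-6,
) -> List[str]:
    """Greedily add the passage that most reduces the QA loss on the known answer.

    Assumption: a greedy search; the paper's actual procedure may differ.
    """
    selected: List[str] = []
    best = qa_loss(question, selected, answer)
    remaining = list(candidates)
    while remaining and len(selected) < max_passages:
        # Score each remaining passage by the loss obtained when it is added.
        scored = [(qa_loss(question, selected + [p], answer), p) for p in remaining]
        loss, passage = min(scored, key=lambda pair: pair[0])
        if best - loss < min_gain:
            break  # no remaining passage helps the QA model any further
        selected.append(passage)
        remaining.remove(passage)
        best = loss
    return selected


def toy_qa_loss(question: str, context: List[str], answer: str) -> float:
    """Hypothetical stand-in for a QA model's loss: share of needed facts missing."""
    needed = ("born in 1950", "moved to Seattle in 1975")
    text = " ".join(context)
    missing = sum(1 for fact in needed if fact not in text)
    return missing / len(needed)


candidates = [
    "Alice was born in 1950.",
    "Seattle is in Washington.",
    "Alice moved to Seattle in 1975.",
]
picked = loss_guided_search(
    "How old was Alice when she moved to Seattle?", "25", candidates, toy_qa_loss
)
print(picked)
```

In this toy setting the distractor passage about Seattle never lowers the loss, so it is left out, while both passages needed for the arithmetic are kept even though neither contains the answer string "25". Passages selected this way could then serve as additional positive labels for retriever training.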
Pages: 1 / 5
Page count: 5