Retrieval Data Augmentation Informed by Downstream Question Answering Performance

Cited by: 0
Authors
Ferguson, James [1]
Dasigi, Pradeep [2]
Khot, Tushar [2]
Hajishirzi, Hannaneh [1, 2]
Affiliations
[1] Univ Washington, Seattle, WA 98195 USA
[2] Allen Inst AI, Seattle, WA USA
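Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract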
Training retrieval models to fetch contexts for Question Answering (QA) over large corpora requires labeling relevant passages in those corpora. Since obtaining exhaustive manual annotations of all relevant passages is not feasible, prior work uses text overlap heuristics to find passages that are likely to contain the answer. Such heuristics break down when the task requires deeper reasoning and answers are not extractable spans (e.g., multi-hop or discrete reasoning). We address this issue by identifying relevant passages based on whether they help a trained QA model arrive at the correct answers, and we develop a search process guided by the QA model's loss. Our experiments show that this approach identifies relevant context for unseen data more than 90% of the time on the IIRC dataset, and that models trained with it generalize better to the end QA task than those trained on only the gold retrieval data on the IIRC and QASC datasets.
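The abstract does not spell out the search procedure, so the following is only a minimal sketch of one way a loss-guided passage search could look, assuming a greedy strategy. The names `loss_guided_search` and `toy_qa_loss`, the `max_passages` and `min_gain` parameters, and the greedy strategy itself are hypothetical and are not taken from the paper. In particular, `toy_qa_loss` is a token-overlap stand-in used only to keep the sketch runnable; in the setting the abstract describes, it would be replaced by the loss of a trained QA model on the gold answer.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical signature: (question, context passages, gold answer) -> loss.
LossFn = Callable[[str, Sequence[str], str], float]


def toy_qa_loss(question: str, context: Sequence[str], answer: str) -> float:
    """Stand-in for a trained QA model's loss (assumption, not the paper's).

    Returns the fraction of answer tokens missing from the context, so the
    loss drops as the context covers more of the answer.
    """
    answer_tokens = answer.lower().split()
    context_tokens = set(" ".join(context).lower().split())
    missing = [tok for tok in answer_tokens if tok not in context_tokens]
    return len(missing) / max(len(answer_tokens), 1)


def loss_guided_search(
    question: str,
    answer: str,
    candidates: Sequence[str],
    qa_loss: LossFn,
    max_passages: int = 2,
    min_gain: float = 1e-6,
) -> Tuple[List[str], float]:
    """Greedily add the candidate passage that lowers the QA loss the most.

    Stops when the passage budget is reached or when no remaining candidate
    improves the loss by at least `min_gain`.
    """
    selected: List[str] = []
    best_loss = qa_loss(question, selected, answer)
    remaining = list(candidates)
    while remaining and len(selected) < max_passages:
        # Score every remaining candidate when appended to the current context.
        scored = [(qa_loss(question, selected + [p], answer), p) for p in remaining]
        loss, passage = min(scored, key=lambda pair: pair[0])
        if best_loss - loss < min_gain:
            break  # no candidate helps the QA model any further
        selected.append(passage)
        remaining.remove(passage)
        best_loss = loss
    return selected, best_loss


if __name__ == "__main__":
    passages = [
        "the fair was held in seattle",
        "the fair opened in 1962",
        "bananas are yellow",
    ]
    chosen, final_loss = loss_guided_search(
        "where and when was the fair", "seattle 1962", passages, toy_qa_loss
    )
    print(chosen, final_loss)
```

Passages selected this way could then serve as additional positive examples when training a retriever, which is the kind of retrieval data augmentation the title refers to.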
Pages: 1-5
Page count: 5