Ranking and Sampling in Open-Domain Question Answering

Cited by: 0
Authors
Xu, Yanfu [1, 2]
Lin, Zheng [1]
Liu, Yuanxin [1, 2]
Liu, Rui [1, 2]
Wang, Weiping [1]
Meng, Dan [1]
Affiliations
[1] Chinese Acad Sci, Inst Informat Engn, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Sch Cyber Secur, Beijing, Peoples R China
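Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract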
Open-domain question answering (OpenQA) aims to answer questions based on a number of unlabeled paragraphs. Existing approaches typically follow the distantly supervised setup, in which some of the paragraphs are wrongly labeled (noisy), and mainly use paragraph-question relevance to denoise. However, paragraph-paragraph relevance, which can aggregate evidence across relevant paragraphs, can also be used to discover more useful paragraphs. Moreover, current approaches mainly focus during training on the positive paragraphs, which are known to contain the answer. This hurts the generalization ability of the model and leaves it prone to being misled by similar but irrelevant (distracting) paragraphs during testing. In this paper, we first introduce a ranking model that leverages both paragraph-question and paragraph-paragraph relevance to compute a confidence score for each paragraph. Furthermore, based on these scores, we design a modified weighted sampling strategy for training to mitigate the influence of noisy and distracting paragraphs. Experiments on three public datasets (Quasar-T, SearchQA and TriviaQA) show that our model advances the state of the art.
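The abstract does not spell out the "modified" sampling strategy, so the following is only a minimal sketch of plain score-weighted sampling, not the authors' method. It assumes each paragraph already has a confidence score (here hand-written placeholders rather than the output of a ranking model), turns the scores into weights with a softmax, and draws distinct paragraphs for a training step. The names `softmax`, `weighted_sample`, and `temperature` are hypothetical and chosen for illustration.

```python
import math
import random


def softmax(scores, temperature=1.0):
    """Convert raw confidence scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp((s - m) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_sample(paragraphs, scores, k, rng=None, temperature=1.0):
    """Draw up to k distinct paragraphs, favoring higher-scored ones.

    Illustrative assumption: this is ordinary weighted sampling without
    replacement, not the modified strategy described in the paper.
    """
    rng = rng or random.Random()
    pool = list(paragraphs)            # copy so the caller's list is untouched
    weights = softmax(scores, temperature)
    chosen = []
    for _ in range(min(k, len(pool))):
        if sum(weights) <= 0.0:        # all remaining weights underflowed
            weights = [1.0] * len(weights)
        idx = rng.choices(range(len(pool)), weights=weights, k=1)[0]
        chosen.append(pool.pop(idx))   # remove so it cannot be drawn again
        weights.pop(idx)
    return chosen


if __name__ == "__main__":
    paragraphs = ["p0", "p1", "p2", "p3"]
    scores = [2.0, 0.5, -1.0, 0.0]     # placeholder confidence scores
    print(weighted_sample(paragraphs, scores, k=2, rng=random.Random(0)))
```

Sampling in proportion to the scores, rather than always taking the top-scored paragraphs, means lower-scored paragraphs still appear occasionally during training; a larger `temperature` flattens the distribution and a smaller one sharpens it.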
Pages: 2412-2421
Page count: 10
Related papers
50 in total
  • [1] Neural Ranking with Weak Supervision for Open-Domain Question Answering: A Survey
    Shen, Xiaoyu
    Vakulenko, Svitlana
    del Tredici, Marco
    Barlacchi, Gianni
    Byrne, Bill
    de Gispert, Adria
    [J]. 17TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EACL 2023, 2023, : 1736 - 1750
  • [2] Ranking Paragraphs for Improving Answer Recall in Open-Domain Question Answering
    Lee, Jinhyuk
    Yun, Seongjun
    Kim, Hyunjae
    Ko, Miyoung
    Kang, Jaewoo
    [J]. 2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018), 2018, : 565 - 569
  • [3] Advances in open-domain question answering
    Zhang, Zhi-Chang
    Zhang, Yu
    Liu, Ting
    Li, Sheng
    [J]. Tien Tzu Hsueh Pao/Acta Electronica Sinica, 2009, 37 (05): : 1058 - 1069
  • [4] Advances in question classification for open-domain question answering
    School of Computer Science and Technology, Anhui University of Technology, Maanshan
    Anhui
    243002, China
    Unknown
    Jiangsu
    210023, China
    [J]. Tien Tzu Hsueh Pao, 8 (1627-1636):
  • [5] Type checking in open-domain question answering
    Schlobach, S
    Olsthoorn, M
    de Rijke, M
    [J]. ECAI 2004: 16TH EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2004, 110 : 398 - 402
  • [6] Open-domain textual question answering techniques
    Harabagiu, Sanda M.
    Maiorano, Steven J.
    Paşca, Marius A.
    [J]. Natural Language Engineering, 2003, 9 (03) : 231 - 267
  • [7] Passage filtering for open-domain Question Answering
    Noguera, Elisa
    Llopis, Fernando
    Ferrandez, Antonio
    [J]. ADVANCES IN NATURAL LANGUAGE PROCESSING, PROCEEDINGS, 2006, 4139 : 534 - 540
  • [8] A Light Ranker for Open-Domain Question Answering
    Qiu, Boyu
    Xu, Jungang
    Chen, Xu
    Sun, Yingfei
    [J]. 2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,
  • [9] PyGaggle: A Gaggle of Resources for Open-Domain Question Answering
    Pradeep, Ronak
    Chen, Haonan
    Gu, Lingwei
    Tamber, Manveer Singh
    Lin, Jimmy
    [J]. ADVANCES IN INFORMATION RETRIEVAL, ECIR 2023, PT III, 2023, 13982 : 148 - 162
  • [10] Adaptive Information Seeking for Open-Domain Question Answering
    Zhu, Yunchang
    Pang, Liang
    Lan, Yanyan
    Shen, Huawei
    Cheng, Xueqi
    [J]. 2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 3615 - 3626