QASAR: Self-Supervised Learning Framework for Extractive Question Answering

Cited by: 3
Authors
Assem, Haytham [1]
Sarkar, Rajdeep [1,2]
Dutta, Sourav [1]
Affiliations
[1] Huawei Res, Dublin, Ireland
[2] Natl Univ Ireland Galway, Galway, Ireland
Funding
Science Foundation Ireland
Keywords
Question Answering; Self-Supervised Learning; Question Generation; Context Retrieval
DOI
10.1109/BigData52589.2021.9671570
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Question Answering (QA) has become a foundational research area in Natural Language Understanding (NLU) with widespread applications in search, personal digital assistants, and conversational systems. Despite their success in open-domain question answering, existing extractive question answering models pre-trained on Wikipedia articles (e.g., SQuAD data) perform rather poorly in closed-domain and industrial scenarios. Further, a major limitation in adapting question answering systems to such contexts is the poor availability and the expensive annotation of domain-specific data. Thus, the wide applicability of QA models is severely hampered in enterprise systems. In this paper, we aim to address the above challenges by introducing a novel QA framework, Qasar, which uses self-supervised learning for efficient domain adaptation. We show, for the first time, the advantage of fine-tuning pre-trained QA models for closed domains on synthetically generated domain-specific questions and answers (derived from relevant documents) from large language models like T5. Further, we propose a novel context retrieval component based on question-context semantic relatedness to further boost the accuracy of the Qasar QA framework. Experimental results show significant performance improvements on both open- and closed-domain QA datasets, while requiring no labelling effort, which we believe will contribute to the ease of deployment of such systems in enterprise settings. The different modules of our framework (synthetic data generation, context retrieval, and question answering) can be fully reproduced by fine-tuning publicly available language models and QA models on the SQuAD dataset as discussed in the paper.
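The context retrieval component described in the abstract ranks candidate passages by their semantic relatedness to the question before extractive answering. The sketch below illustrates that retrieval idea only, using a naive bag-of-words cosine similarity as a stand-in for the learned semantic-relatedness model in the paper; all function names and example texts here are hypothetical, not from the paper's implementation.

```python
import math
from collections import Counter

def vectorize(text):
    # Naive tokenization into term-frequency counts; the paper would
    # instead use a learned semantic encoder for question/context.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse term-frequency vectors.
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve_context(question, contexts, top_k=1):
    # Rank candidate contexts by relatedness to the question and
    # return the top-k; a QA reader would then extract the answer span.
    q = vectorize(question)
    ranked = sorted(contexts, key=lambda c: cosine(q, vectorize(c)),
                    reverse=True)
    return ranked[:top_k]

contexts = [
    "The router firmware update fixes a DNS resolution bug.",
    "Quarterly revenue grew by twelve percent year over year.",
]
best = retrieve_context("How do I fix DNS resolution on the router?", contexts)
```

In the full framework this relatedness score would come from a fine-tuned semantic model rather than raw term overlap, but the ranking-and-select structure is the same.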
Pages: 1797-1808
Page count: 12
Related Papers
(50 in total)
  • [1] SESAME - self-supervised framework for extractive question answering over document collections
    Batista, Vitor A.
    Gomes, Diogo S. M.
    Evsukoff, Alexandre
    [J]. JOURNAL OF INTELLIGENT INFORMATION SYSTEMS, 2024
  • [2] elBERto: Self-supervised commonsense learning for question answering
    Zhan, Xunlin
    Li, Yuan
    Dong, Xiao
    Liang, Xiaodan
    Hu, Zhiting
    Carin, Lawrence
    [J]. KNOWLEDGE-BASED SYSTEMS, 2022, 258
  • [3] Self-supervised Dialogue Learning for Spoken Conversational Question Answering
    Chen, Nuo
    You, Chenyu
    Zou, Yuexian
    [J]. INTERSPEECH 2021, 2021, : 231 - 235
  • [4] Self-supervised Graph Contrastive Learning for Video Question Answering
    Yao, Xuan
    Gao, Jun-Yu
    Xu, Chang-Sheng
    [J]. Ruan Jian Xue Bao/Journal of Software, 2023, 34 (05): : 2083 - 2100
  • [5] A multi-scale self-supervised hypergraph contrastive learning framework for video question answering
    Wang, Zheng
    Wu, Bin
    Ota, Kaoru
    Dong, Mianxiong
    Li, He
    [J]. NEURAL NETWORKS, 2023, 168 : 272 - 286
  • [6] Overcoming Language Priors with Self-supervised Learning for Visual Question Answering
    Zhu, Xi
    Mao, Zhendong
    Liu, Chunxiao
    Zhang, Peng
    Wang, Bin
    Zhang, Yongdong
    [J]. PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, : 1083 - 1089
  • [7] Simple contrastive learning in a self-supervised manner for robust visual question answering
    Yang, Shuwen
    Xiao, Luwei
    Wu, Xingjiao
    Xu, Junjie
    Wang, Linlin
    He, Liang
    [J]. COMPUTER VISION AND IMAGE UNDERSTANDING, 2024, 241
  • [8] Self-Supervised Knowledge Triplet Learning for Zero-Shot Question Answering
    Banerjee, Pratyay
    Baral, Chitta
    [J]. PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020, : 151 - 162
  • [9] ASCL: Adaptive self-supervised counterfactual learning for robust visual question answering
    Shu, Xinyao
    Yan, Shiyang
    Yang, Xu
    Wu, Ziheng
    Chen, Zhongfeng
    Lu, Zhenyu
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2024, 248
  • [10] Self-Supervised Learning for Contextualized Extractive Summarization
    Wang, Hong
    Wang, Xin
    Xiong, Wenhan
    Yu, Mo
    Guo, Xiaoxiao
    Chang, Shiyu
    Wang, William Yang
    [J]. 57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019, : 2221 - 2227