QASAR: Self-Supervised Learning Framework for Extractive Question Answering

Cited by: 3
Authors
Assem, Haytham [1]
Sarkar, Rajdeep [1, 2]
Dutta, Sourav [1]
Affiliations
[1] Huawei Res, Dublin, Ireland
[2] Natl Univ Ireland Galway, Galway, Ireland
Funding
Science Foundation Ireland
Keywords
Question Answering; Self-Supervised Learning; Question Generation; Context Retrieval
DOI
10.1109/BigData52589.2021.9671570
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Question Answering (QA) has become a foundational research area in Natural Language Understanding (NLU) with widespread applications in search, personal digital assistance, and conversational systems. Despite the success in open-domain question answering, existing extractive question answering models pre-trained using Wikipedia articles (e.g., SQuAD data) perform rather poorly in closed-domain and industrial scenarios. Further, a major limitation in adapting question answering systems to such contexts is the poor availability and the expensive annotation of domain-specific data. Thus, the wide applicability of QA models is severely hampered in enterprise systems. In this paper, we aim to address the above challenges by introducing a novel QA framework, Qasar, using self-supervised learning for efficient domain adaptation. We show, for the first time, the advantage of fine-tuning pre-trained QA models for closed domains on domain-specific questions and answers synthetically generated (from relevant documents) by large language models like T5. Further, we also propose a novel context retrieval component based on question-context semantic relatedness to further boost the accuracy of the Qasar QA framework. Experimental results show significant performance improvements on both open- and closed-domain QA datasets, while requiring no labelling efforts, which we believe will contribute to the ease of deployment of such systems in enterprise settings. The different modules of our framework (synthetic data generation, context retrieval, and question answering) can be fully reproduced by fine-tuning publicly available language models and QA models on the SQuAD dataset as discussed in the paper.
Pages: 1797-1808
Page count: 12