Improving Biomedical Question Answering by Data Augmentation and Model Weighting

Cited by: 2
Authors
Du, Yongping [1]
Yan, Jingya [1]
Lu, Yuxuan [1]
Zhao, Yiliang [1]
Jin, Xingnan [1]
Affiliations
[1] Beijing Univ Technol, Fac Informat Technol, Beijing 100124, Peoples R China
Funding
Beijing Natural Science Foundation; National Key Research and Development Program of China
Keywords
Biological system modeling; Data models; Training; Task analysis; Predictive models; Context modeling; Training data; Biomedical question answering; data augmentation; deep learning; model weighting;
DOI
10.1109/TCBB.2022.3171388
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Biomedical question answering aims to extract an answer to a given question from a biomedical context. Because the domain requires specialized expertise, large-scale datasets for domain-specific question answering are difficult to build. Existing methods are limited by the lack of training data, so their performance falls short of that achieved in open-domain settings and degrades further on adversarial samples. We address these issues in three ways. First, we adopt data augmentation strategies to improve model training, including sliding windows, summarization, and round-trip translation. Second, we propose a model weighting strategy for final answer prediction in the biomedical domain that combines the strengths of two models: the open-domain model QANet and BioBERT, which is pre-trained on biomedical data. Finally, we apply adversarial training to reinforce the robustness of the model. We evaluate our approach on the public biomedical dataset collected from PubMed and provided by the BioASQ challenge. The results show that performance improves significantly over a single model and over other models that participated in the BioASQ challenge. The model learns richer semantic representations from data augmentation and adversarial samples, which helps it solve more complex question answering problems in the biomedical domain.
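The abstract names two ideas that a small sketch may help make concrete: sliding-window augmentation of long contexts, and weighting the predictions of two models. The Python below is only an illustration under stated assumptions, not the authors' implementation. The function names `sliding_windows` and `weighted_answer`, the window and stride parameters, and the convex combination with a single weight are all hypothetical; the abstract does not specify how the weights are chosen or how QANet and BioBERT scores are normalized.

```python
from typing import Dict, List, Tuple


def sliding_windows(tokens: List[str], window_size: int, stride: int) -> List[List[str]]:
    """Split a token list into overlapping windows (hypothetical helper).

    Each window can serve as an additional, shorter training context.
    """
    if window_size <= 0 or stride <= 0:
        raise ValueError("window_size and stride must be positive")
    windows = []
    start = 0
    while True:
        windows.append(tokens[start:start + window_size])
        # Stop once the current window reaches the end of the text.
        if start + window_size >= len(tokens):
            break
        start += stride
    return windows


def weighted_answer(
    scores_a: Dict[str, float],
    scores_b: Dict[str, float],
    weight_a: float,
) -> Tuple[str, float]:
    """Combine candidate-answer scores from two models (assumed scheme).

    Uses a convex combination: weight_a for model A, 1 - weight_a for model B.
    A candidate missing from one model is scored 0.0 by that model.
    """
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must lie in [0, 1]")
    candidates = sorted(set(scores_a) | set(scores_b))
    if not candidates:
        raise ValueError("no candidate answers to score")
    combined = {
        c: weight_a * scores_a.get(c, 0.0) + (1.0 - weight_a) * scores_b.get(c, 0.0)
        for c in candidates
    }
    best = max(candidates, key=lambda c: combined[c])
    return best, combined[best]
```

In this sketch, model A could stand in for an open-domain reader and model B for a biomedical one; with equal weights, a candidate that one model ranks second can still win if the other model scores it highly. In practice the weight would have to be tuned on held-out data.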
Pages: 1114-1124
Page count: 11