PANLP at MEDIQA 2019: Pre-trained Language Models, Transfer Learning and Knowledge Distillation

被引:0
|
作者
Zhu, Wei [1 ]
Zhou, Xiaofeng [1 ]
Wang, Keqiang [1 ]
Luo, Xun [1 ]
Li, Xiepeng [1 ]
Ni, Yuan [1 ]
Xie, Guotong [1 ]
机构
[1] Pingan Hlth Tech, Shanghai, Peoples R China
关键词
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
This paper describes the models designated for the MEDIQA 2019 shared tasks by the team PANLP. We take advantages of the recent advances in pre-trained bidirectional transformer language models such as BERT (Devlin et al., 2018) and MT-DNN (Liu et al., 2019b). We find that pre-trained language models can significantly outperform traditional deep learning models. Transfer learning from the NLI task to the RQE task is also experimented, which proves to be useful in improving the results of fine-tuning MT-DNN large. A knowledge distillation process is implemented, to distill the knowledge contained in a set of models and transfer it into an single model, whose performance turns out to be comparable with that obtained by the ensemble of that set of models. Finally, for test submissions, model ensemble and a re-ranking process are implemented to boost the performances. Our models participated in all three tasks and ranked the 1st place for the RQE task, and the 2nd place for the NLI task, and also the 2nd place for the QA task.
引用
收藏
页码:380 / 388
页数:9
相关论文
共 50 条
  • [1] Dynamic Knowledge Distillation for Pre-trained Language Models
    Li, Lei
    Lin, Yankai
    Ren, Shuhuai
    Li, Peng
    Zhou, Jie
    Sun, Xu
    [J]. 2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 379 - 389
  • [2] Knowledge Base Grounded Pre-trained Language Models via Distillation
    Sourty, Raphael
    Moreno, Jose G.
    Servant, Francois-Paul
    Tamine, Lynda
    [J]. 39TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING, SAC 2024, 2024, : 1617 - 1625
  • [3] ReAugKD: Retrieval-Augmented Knowledge Distillation For Pre-trained Language Models
    Zhang, Jianyi
    Muhamed, Aashiq
    Anantharaman, Aditya
    Wang, Guoyin
    Chen, Changyou
    Zhong, Kai
    Cui, Qingjun
    Xu, Yi
    Zeng, Belinda
    Chilimbi, Trishul
    Chen, Yiran
    [J]. 61ST CONFERENCE OF THE THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 2, 2023, : 1128 - 1136
  • [4] Meta Distant Transfer Learning for Pre-trained Language Models
    Wang, Chengyu
    Pan, Haojie
    Qiu, Minghui
    Yang, Fei
    Huang, Jun
    Zhang, Yin
    [J]. 2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 9742 - 9752
  • [5] Knowledge Inheritance for Pre-trained Language Models
    Qin, Yujia
    Lin, Yankai
    Yi, Jing
    Zhang, Jiajie
    Han, Xu
    Zhang, Zhengyan
    Su, Yusheng
    Liu, Zhiyuan
    Li, Peng
    Sun, Maosong
    Zhou, Jie
    [J]. NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES, 2022, : 3921 - 3937
  • [6] MERGEDISTILL: Merging Pre-trained Language Models using Distillation
    Khanuja, Simran
    Johnson, Melvin
    Talukdar, Partha
    [J]. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021, : 2874 - 2887
  • [7] Probing Pre-Trained Language Models for Disease Knowledge
    Alghanmi, Israa
    Espinosa-Anke, Luis
    Schockaert, Steven
    [J]. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021, : 3023 - 3033
  • [8] A Survey of Knowledge Enhanced Pre-Trained Language Models
    Hu, Linmei
    Liu, Zeyi
    Zhao, Ziwang
    Hou, Lei
    Nie, Liqiang
    Li, Juanzi
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2024, 36 (04) : 1413 - 1430
  • [9] KroneckerBERT: Significant Compression of Pre-trained Language Models Through Kronecker Decomposition and Knowledge Distillation
    Tahaei, Marzieh S.
    Charlaix, Ella
    Nia, Vahid Partovi
    Ghodsi, Ali
    Rezagholizadeh, Mehdi
    [J]. NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES, 2022, : 2116 - 2127
  • [10] Explanation Guided Knowledge Distillation for Pre-trained Language Model Compression
    Yang, Zhao
    Zhang, Yuanzhe
    Sui, Dianbo
    Ju, Yiming
    Zhao, Jun
    Liu, Kang
    [J]. ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2024, 23 (02)