Exploring Pre-trained Language Models for Vocabulary Alignment in the UMLS

Cited by: 0
Authors
Hao, Xubing [1 ]
Abeysinghe, Rashmie [2 ]
Shi, Jay [3 ]
Cui, Licong [1 ]
Affiliations
[1] Univ Texas Hlth Sci Ctr Houston, McWilliams Sch Biomed Informat, Houston, TX 77030 USA
[2] Univ Texas Hlth Sci Ctr Houston, Dept Neurol, Houston, TX 77030 USA
[3] Intermt Healthcare, Denver, CO 80206 USA
Funding
U.S. National Institutes of Health; U.S. National Science Foundation;
Keywords
UMLS Metathesaurus; Pre-trained Language Models; Vocabulary Alignment;
DOI
10.1007/978-3-031-66538-7_27
CLC number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The Unified Medical Language System (UMLS) Metathesaurus integrates and aligns terms from hundreds of biomedical vocabularies. In this paper, we investigate the efficacy of Pre-trained Language Models (PLMs) for vocabulary alignment in the UMLS Metathesaurus. We frame the problem as two Natural Language Processing tasks: Text Classification and Text Generation. We fine-tune four open-source, state-of-the-art PLMs: BERT, RoBERTa, GPT-2, and BLOOM. Experiments show that RoBERTa performs best, achieving a precision, recall, and F1 score of 0.965, 0.940, and 0.952, respectively. In addition, incorporating contextual information into the inputs improves model performance on the Text Classification task, but has limited impact on the Text Generation task. Domain expert evaluation of 100 randomly selected instances generated by the best model identified 78 of them as valid synonymous terms, indicating the promise of PLMs in enhancing the mapping quality of the UMLS Metathesaurus.
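The Text Classification framing described in the abstract pairs two terms, optionally augmented with contextual information such as their source vocabularies, and asks a fine-tuned PLM whether they are synonymous. The following Python sketch, built on the Hugging Face transformers API, illustrates this setup with a RoBERTa sequence-pair classifier; the checkpoint name, context format, label indexing, and example terms are illustrative assumptions, not the authors' released code.

```python
# Minimal sketch of the Text Classification framing: score a pair of terms
# as synonymous / not synonymous with a RoBERTa sequence-pair classifier.
# The checkpoint, context format, and label semantics are assumptions.

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "roberta-base"  # assumption: any RoBERTa checkpoint; fine-tuning is needed for real use

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)
model.eval()


def score_synonymy(term_a: str, term_b: str, context_a: str = "", context_b: str = "") -> float:
    """Return P(synonymous) for a pair of terms, optionally augmented with context."""
    # Contextual information (e.g., source vocabulary) is appended to each term,
    # mirroring the abstract's note that added context helps the classification task.
    text_a = f"{term_a} [{context_a}]" if context_a else term_a
    text_b = f"{term_b} [{context_b}]" if context_b else term_b
    inputs = tokenizer(text_a, text_b, return_tensors="pt", truncation=True, max_length=128)
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = torch.softmax(logits, dim=-1)
    return probs[0, 1].item()  # assumption: label index 1 = "synonymous"


if __name__ == "__main__":
    # Illustrative UMLS-style term pair; an untuned checkpoint gives an arbitrary score.
    print(score_synonymy("myocardial infarction", "heart attack",
                         context_a="SNOMEDCT_US", context_b="MSH"))
```

In this framing, each candidate term pair from the Metathesaurus becomes one training or inference instance, so the same function can be applied over all pairs produced by a candidate-generation step.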
Pages: 273-278
Number of pages: 6