Domain adaptation for semantic role labeling of clinical text

Cited by: 14
Authors
Zhang, Yaoyun [1]
Tang, Buzhou [1,2]
Jiang, Min [1]
Wang, Jingqi [1]
Xu, Hua [1]
Affiliations
[1] Univ Texas Houston, Sch Biomed Informat Houston, Houston, TX 77030 USA
[2] Shenzhen Grad Sch, Harbin Inst Technol, Dept Comp Sci, Shenzhen, Guangdong, Peoples R China
Keywords
semantic role labeling; shallow semantic parsing; clinical natural language processing; domain adaptation; transfer learning; BIOMEDICAL LITERATURE; ANNOTATED CORPUS; INFORMATION; EXTRACTION; KNOWLEDGE; SYSTEM;
DOI
10.1093/jamia/ocu048
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Objective: Semantic role labeling (SRL), which extracts a shallow semantic relation representation from the varied surface forms of free-text sentences, is important for understanding natural language. Few SRL studies have been conducted in the medical domain, primarily because annotated clinical SRL corpora are lacking and are time-consuming and costly to build. The goal of this study is to investigate domain adaptation techniques for clinical SRL that leverage resources built from newswire and biomedical literature, in order to improve performance and reduce annotation costs.

Materials and Methods: The Multisource Integrated Platform for Answering Clinical Questions (MiPACQ) corpus, a manually annotated clinical SRL corpus, was used as the target domain dataset. PropBank and NomBank from newswire and BioProp from biomedical literature were used as source domain datasets. Three state-of-the-art domain adaptation algorithms were employed: instance pruning, transfer self-training, and feature augmentation. SRL performance with each domain adaptation algorithm was evaluated using 10-fold cross-validation on the MiPACQ corpus. Learning curves for the different methods were generated to assess the effect of sample size.

Results and Conclusion: When all three source domain corpora were used, the feature augmentation algorithm achieved a statistically significantly higher F-measure (83.18%) than the baseline trained on the MiPACQ dataset alone (F-measure, 81.53%), indicating that domain adaptation algorithms may improve SRL performance on clinical text. To achieve performance comparable to the baseline method that used 90% of MiPACQ training samples, the feature augmentation algorithm required < 50% of the MiPACQ training samples, demonstrating that annotation costs of clinical SRL can be reduced significantly by leveraging existing SRL resources from other domains.
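The abstract names feature augmentation but does not describe its mechanics. A minimal sketch of one common formulation follows, assuming the widely used scheme in which every feature is duplicated into a shared copy and a domain-specific copy, so that a single linear classifier can learn both shared and per-domain weights. The function name `augment`, the `general`/`source`/`target` prefixes, and the toy feature names are hypothetical and not taken from the paper.

```python
from typing import Dict

Features = Dict[str, float]


def augment(features: Features, domain: str) -> Features:
    """Duplicate each feature into a shared copy and a domain-specific copy.

    Assumption: this is the generic feature-duplication scheme; the exact
    variant used in the study is not stated in the abstract.
    """
    augmented: Features = {}
    for name, value in features.items():
        augmented[f"general::{name}"] = value   # shared across all domains
        augmented[f"{domain}::{name}"] = value  # visible only within this domain
    return augmented


# Hypothetical toy SRL features for one argument candidate in each domain.
newswire_example: Features = {"pred=give": 1.0, "arg_pos=NN": 1.0}
clinical_example: Features = {"pred=give": 1.0, "arg_pos=NN": 1.0}

source_aug = augment(newswire_example, "source")
target_aug = augment(clinical_example, "target")

# Only the shared copies overlap, so source data can influence the target
# model through the "general::" weights alone.
shared_keys = set(source_aug) & set(target_aug)
```

In this sketch, examples from both domains would then be pooled and passed to any linear learner; the target-specific copies let the model override what it learned from newswire or biomedical text when clinical usage differs.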
Pages: 967-979
Page count: 13
Related papers
50 in total
  • [1] Domain adaptation for semantic role labeling in the biomedical domain
    Dahlmeier, Daniel
    Ng, Hwee Tou
    BIOINFORMATICS, 2010, 26 (08) : 1098 - 1104
  • [2] Domain-Adaptation Technique for Semantic Role Labeling with Structural Learning
    Lim, Soojong
    Lee, Changki
    Ryu, Pum-Mo
    Kim, Hyunki
    Park, Sang Kyu
    Ra, Dongyul
    ETRI JOURNAL, 2014, 36 (03) : 429 - 438
  • [3] Domain Adaptation for Text Categorization by Feature Labeling
    Kadar, Cristina
    Iria, Jose
    ADVANCES IN INFORMATION RETRIEVAL, 2011, 6611 : 424 - +
  • [4] Text Rewriting Improves Semantic Role Labeling
    Woodsend, Kristian
    Lapata, Mirella
    JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2014, 51 : 133 - 164
  • [5] An Improved Semantic Role Labeling for Myanmar Text
    Zin Mar Kyu
    Naw Lay Wah
    International Journal of Networked and Distributed Computing, 2019, 7 : 51 - 58
  • [6] Domain Adaptation in Semantic Role Labeling Using a Neural Language Model and Linguistic Resources
    Quynh Thi Ngoc Do
    Bethard, Steven
    Moens, Marie-Francine
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2015, 23 (11) : 1812 - 1823
  • [7] Text Watermarking Algorithm Based on Semantic Role Labeling
    Chen, Jianping
    Yang, Fangxing
    Ma, Haiying
    Lu, Qiuru
    2016 THIRD INTERNATIONAL CONFERENCE ON DIGITAL INFORMATION PROCESSING, DATA MINING, AND WIRELESS COMMUNICATIONS (DIPDMWC), 2016, : 117 - 120
  • [8] Semantic Role Labeling Approach for Evaluation of Text Coherence
    Haggag, Mohamed H.
    INTERNATIONAL JOURNAL OF INFORMATION RETRIEVAL RESEARCH, 2013, 3 (03) : 59 - 77
  • [9] Out-of-domain FrameNet Semantic Role Labeling
    Hartmann, Silvana
    Kuznetsov, Ilia
    Martin, Teresa
    Gurevych, Iryna
    15TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2017), VOL 1: LONG PAPERS, 2017, : 471 - 482
  • [10] Towards Open-Domain Semantic Role Labeling
    Croce, Danilo
    Giannone, Cristina
    Annesi, Paolo
    Basili, Roberto
    ACL 2010: 48TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, 2010, : 237 - 246