A semi-supervised learning framework for biomedical event extraction based on hidden topics

Cited by: 34
Authors
Zhou, Deyu [1]
Zhong, Dayou [1]
Affiliations
[1] Southeast Univ, Minist Educ, Key Lab Comp Network & Informat Integrat, Sch Comp Sci & Engn, Nanjing 210096, Jiangsu, Peoples R China
Funding
National Science Foundation (USA);
Keywords
Semi-supervised learning; Biomedical event extraction; Latent Dirichlet allocation; K nearest neighbor;
DOI
10.1016/j.artmed.2015.03.004
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Objectives: Scientists have devoted decades of effort to understanding protein interactions or RNA production. This information may advance current knowledge of drug reactions or the development of certain diseases. However, life science literature, one of the most important sources of this information, lacks explicit structure, which prevents computer-based systems from accessing it. Biomedical event extraction, which automatically acquires knowledge of molecular events from research articles, has therefore attracted community-wide effort in recent years. Most approaches are based on statistical models that require large-scale annotated corpora to estimate model parameters precisely. Such corpora, however, are usually difficult to obtain in practice. Employing un-annotated data through semi-supervised learning is therefore a feasible solution for biomedical event extraction and is attracting growing interest.
Methods and material: This paper presents a semi-supervised learning framework based on hidden topics for biomedical event extraction. In this framework, sentences in the un-annotated corpus are automatically assigned event annotations based on their distances to sentences in the annotated corpus. More specifically, the distance is described using not only the structures of the sentences but also the hidden topics embedded in them. The sentences and their newly assigned event annotations, together with the annotated corpus, are then used for training.
Results: Experiments were conducted on the multi-level event extraction corpus, a gold standard corpus. The results show that the proposed framework improves the F-score on biomedical event extraction by more than 2.2% compared with the state-of-the-art approach.
Conclusion: The results suggest that, by incorporating un-annotated data, the proposed framework improves the performance of the state-of-the-art event extraction system, and that the similarity between sentences might be precisely described by hidden topics and sentence structures. (C) 2015 Elsevier B.V. All rights reserved.
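The labelling step described in the abstract (assigning event annotations to un-annotated sentences by their distance to annotated ones) can be illustrated with a minimal sketch. Everything concrete here is an assumption, not the authors' implementation: the abstract does not give the distance formula, so this sketch combines a Hellinger distance over precomputed topic distributions with a Jaccard distance over sets of structural features, weighted by a hypothetical parameter `alpha`, and uses a k-nearest-neighbour majority vote. The topic vectors, feature names, and event labels are made-up illustrative values.

```python
from collections import Counter
from math import sqrt


def hellinger(p, q):
    """Hellinger distance between two discrete distributions (0 = identical, 1 = disjoint)."""
    return sqrt(sum((sqrt(a) - sqrt(b)) ** 2 for a, b in zip(p, q))) / sqrt(2)


def jaccard_distance(a, b):
    """One minus the Jaccard overlap of two feature sets."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def combined_distance(s1, s2, alpha=0.5):
    """Weighted mix of topic distance and structure distance (alpha is an assumed weight)."""
    topic_part = hellinger(s1["topics"], s2["topics"])
    structure_part = jaccard_distance(s1["structure"], s2["structure"])
    return alpha * topic_part + (1 - alpha) * structure_part


def knn_label(sentence, annotated, k=3, alpha=0.5):
    """Return the majority event label among the k nearest annotated sentences."""
    neighbours = sorted(
        annotated, key=lambda a: combined_distance(sentence, a, alpha)
    )[:k]
    votes = Counter(n["event"] for n in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical annotated corpus: topic distributions would come from a topic
# model such as LDA; structure features would come from a syntactic parser.
annotated = [
    {"topics": [0.9, 0.1], "structure": {"nsubj", "dobj", "prep_of"}, "event": "Regulation"},
    {"topics": [0.8, 0.2], "structure": {"nsubj", "dobj"}, "event": "Regulation"},
    {"topics": [0.1, 0.9], "structure": {"nsubj", "prep_in"}, "event": "Localization"},
    {"topics": [0.2, 0.8], "structure": {"prep_in", "amod"}, "event": "Localization"},
]

# Hypothetical un-annotated sentence to be labelled automatically.
unlabelled = {"topics": [0.85, 0.15], "structure": {"nsubj", "dobj"}}

predicted = knn_label(unlabelled, annotated, k=3)
```

In a semi-supervised setup of the kind the abstract describes, sentences labelled this way would be added to the annotated corpus before retraining the extractor; how the original work filters or weights such automatic annotations is not stated in the abstract.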
Pages: 51-58
Page count: 8
Related papers
50 in total
  • [21] A Flexible Generative Framework for Graph-based Semi-supervised Learning
    Ma, Jiaqi
    Tang, Weijing
    Zhu, Ji
    Mei, Qiaozhu
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [22] A Graph Based Subspace Semi-supervised Learning Framework for Dimensionality Reduction
    Yang, Wuyi
    Zhang, Shuwu
    Liang, Wei
    [J]. COMPUTER VISION - ECCV 2008, PT II, PROCEEDINGS, 2008, 5303 : 664 - 677
  • [23] Semi-supervised Clustering Framework Based on Active Learning for Real Data
    Odate, Ryosuke
    Shinjo, Hiroshi
    Suzuki, Yasufumi
    Motobayashi, Masahiro
    [J]. STRUCTURAL, SYNTACTIC, AND STATISTICAL PATTERN RECOGNITION, S+SSPR 2018, 2018, 11004 : 184 - 193
  • [24] Semi-supervised learning method based on distance metric loss framework
    Liu, Ban-Teng
    Ye, Zan-Ting
    Qin, Hai-Long
    Wang, Ke
    Zheng, Qi-Hang
    Wang, Zhang-Quan
    [J]. Zhejiang Daxue Xuebao (Gongxue Ban)/Journal of Zhejiang University (Engineering Science), 2023, 57 (04): : 744 - 752
  • [25] A framework for semi-supervised learning based on subjective and objective clustering criteria
    Halkidi, M
    Gunopulos, D
    Kumar, N
    Vazirgiannis, M
    Domeniconi, C
    [J]. FIFTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 2005, : 637 - 640
  • [26] An Efficient Framework Based on Semi-Supervised Learning for Transformer Fault Diagnosis
    Yang, Jiarong
    Yang, Dingkun
    Bao, Jinshan
    Zhang, Jing
    He, Yu
    Yan, Rujing
    Zhang, Ying
    Hu, Kelin
    [J]. IEEJ TRANSACTIONS ON ELECTRICAL AND ELECTRONIC ENGINEERING, 2024, 19 (03) : 362 - 372
  • [27] Semi-supervised learning of the hidden vector state model for protein-protein interactions extraction
    Zhou, Deyu
    He, Yulan
    Kwoh, Chee Keong
    [J]. 2007 IEEE SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND DATA MINING, VOLS 1 AND 2, 2007, : 674 - 680
  • [28] Semi-supervised learning of causal relations in biomedical scientific discourse
    Mihaila, Claudiu
    Ananiadou, Sophia
    [J]. BIOMEDICAL ENGINEERING ONLINE, 2014, 13
  • [29] Semi-supervised learning of causal relations in biomedical scientific discourse
    Claudiu Mihăilă
    Sophia Ananiadou
    [J]. BioMedical Engineering OnLine, 13
  • [30] Self-supervised Correction Learning for Semi-supervised Biomedical Image Segmentation
    Zhang, Ruifei
    Liu, Sishuo
    Yu, Yizhou
    Li, Guanbin
    [J]. MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2021, PT II, 2021, 12902 : 134 - 144