Discovering novel protein-protein interactions by measuring the protein semantic similarity from the biomedical literature

Cited by: 4
Authors
Chiang, Jung-Hsien [1]
Ju, Jiun-Huang [1]
Affiliation
[1] Natl Cheng Kung Univ, Dept Comp Sci & Informat Engn, Tainan 701, Taiwan
Keywords
Classification; machine learning; protein-protein interaction; text mining; INTERACTION EXTRACTION; DATABASE; INFORMATION; DISTANCE;
DOI
10.1142/S0219720014420086
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Protein-protein interactions (PPIs) are involved in the majority of biological processes. Identification of PPIs is therefore one of the key aims of biological research. Although there are many databases of PPIs, many other unidentified PPIs could be buried in the biomedical literature. Therefore, automated identification of PPIs from biomedical literature repositories could be used to discover otherwise hidden interactions. Search engines, such as Google, have been successfully applied to measure the relatedness among words. Inspired by such approaches, we propose a novel method to identify PPIs through semantic similarity measures among protein mentions. We define six semantic similarity measures as features based on the page counts retrieved from the MEDLINE database. A machine learning classifier, Random Forest, is trained using the above features. The proposed approach achieves an averaged micro-F of 71.28% and an averaged macro-F of 64.03% over five PPI corpora, an improvement over the results of using only the conventional co-occurrence feature (averaged micro-F of 68.79% and averaged macro-F of 60.49%). A relation-word reinforcement further improves the averaged micro-F to 71.3% and the averaged macro-F to 65.12%. Comparing the results of the current work with other studies on the AIMed corpus (ranging from 77.58% to 85.1% in micro-F and from 62.18% to 76.27% in macro-F), we show that the proposed approach achieves a micro-F of 81.88% and a macro-F of 64.01% without the use of sophisticated feature extraction. Finally, we manually examine the newly discovered PPI pairs based on a literature review, and the results suggest that our approach could extract novel protein-protein interactions.
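The abstract does not name its six page-count measures, so the following is only a minimal sketch of how features of this kind could be computed for one protein pair. The four inputs (hit counts for each protein alone, for both together, and the total number of indexed records) are assumed to have been retrieved from MEDLINE beforehand. The function name `page_count_features` and the particular choice of measures (raw co-occurrence, Jaccard, Dice, overlap, pointwise mutual information, and the normalized search-engine distance used for Google page counts) are assumptions for illustration, not the authors' definitions.

```python
import math


def page_count_features(fx, fy, fxy, n_docs):
    """Similarity features for a protein pair (x, y) from page counts.

    fx, fy  -- number of records mentioning x, and y, respectively
    fxy     -- number of records mentioning both x and y
    n_docs  -- total number of records in the index
    All names and the set of measures are illustrative assumptions.
    """
    if fx <= 0 or fy <= 0 or fxy <= 0:
        # Convention chosen here: no co-occurrence means zero similarity
        # and an infinite distance.
        return {
            "cooccurrence": 0.0,
            "jaccard": 0.0,
            "dice": 0.0,
            "overlap": 0.0,
            "pmi": 0.0,
            "ngd": math.inf,
        }

    log_fx, log_fy, log_fxy = math.log(fx), math.log(fy), math.log(fxy)
    denom = math.log(n_docs) - min(log_fx, log_fy)

    return {
        # Conventional co-occurrence baseline: the raw joint count.
        "cooccurrence": float(fxy),
        # Set-overlap style coefficients on the record sets.
        "jaccard": fxy / (fx + fy - fxy),
        "dice": 2 * fxy / (fx + fy),
        "overlap": fxy / min(fx, fy),
        # Pointwise mutual information, in bits.
        "pmi": math.log2(fxy * n_docs / (fx * fy)),
        # Normalized distance from page counts (smaller means more related).
        "ngd": (max(log_fx, log_fy) - log_fxy) / denom if denom > 0 else 0.0,
    }


# Example call with made-up counts (not taken from the paper).
features = page_count_features(fx=1000, fy=500, fxy=100, n_docs=1_000_000)
```

Each pair's feature dictionary could then be turned into one row of a feature matrix for a classifier such as a Random Forest, which is the classifier the abstract reports using.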
Pages: 14