Detection of Protein Subcellular Localization based on a Full Syntactic Parser and Semantic Information

Cited by: 2
Authors
Kim, Mi-Young [1]
Affiliations
[1] Sungshin Womens Univ, Sch Engn & Comp Sci, Seoul 136742, South Korea
DOI
10.1109/FSKD.2008.529
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
A protein's subcellular localization is considered an essential part of the description of its associated biomolecular phenomena. As the volume of biomolecular reports has increased, there has been a great deal of research on text mining to detect protein subcellular localization information in documents. It has been argued that linguistic information, especially syntactic information, is useful for identifying the subcellular localizations of proteins of interest. However, previous systems for detecting protein subcellular localization information used only shallow syntactic parsers, and showed poor performance. Thus, there remains a need to use a full syntactic parser and to apply deep linguistic knowledge to the analysis of text for protein subcellular localization information. In addition, we have attempted to use semantic information from the WordNet thesaurus. To improve performance in detecting protein subcellular localization information, this paper proposes a three-step method based on a full syntactic dependency parser and semantic information. In the first step, we construct syntactic dependency paths from each protein to its location candidate. In the second step, we retrieve root information of the syntactic dependency paths. In the final step, we extract syn-semantic patterns of protein subtrees and location subtrees. From the root and subtree nodes, we extract syntactic category and syntactic direction as syntactic information, and synset offset of the WordNet thesaurus as semantic information. According to the root information and syn-semantic patterns of subtrees, we extract (protein, localization) pairs. Even with no biomolecular knowledge, our method shows reasonable performance in experimental results using Medline abstract data. In fact, our proposed method gave an F-measure of 74.53% for training data and 58.90% for test data, significantly outperforming previous methods, by 12-25%.
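The first two steps described in the abstract (building a dependency path from a protein to a location candidate, then reading off the root of that path) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy sentence, the hand-written head indices, and the function names `ancestors`, `dependency_path`, and `describe` are hypothetical and are not taken from the paper. The sketch treats the lowest common ancestor of the two tokens as the path root, and it omits the WordNet synset offsets that the paper uses as semantic information.

```python
# Toy dependency parse, hand-written for illustration (not real parser output).
# Each token is (word, syntactic category, index of its head, or None for the sentence root).
TOKENS = [
    ("ProteinX", "NOUN", 1),      # 0: subject of "localizes" (hypothetical protein name)
    ("localizes", "VERB", None),  # 1: sentence root
    ("to", "ADP", 1),             # 2: preposition attached to the verb
    ("the", "DET", 4),            # 3: determiner of "nucleus"
    ("nucleus", "NOUN", 2),       # 4: location candidate, object of "to"
]


def ancestors(tokens, i):
    """Return the token indices from i up to the sentence root, inclusive."""
    chain = []
    cur = i
    while cur is not None:
        chain.append(cur)
        cur = tokens[cur][2]
    return chain


def dependency_path(tokens, protein, location):
    """Split the path between two tokens at their lowest common ancestor.

    Returns (root index, protein-side nodes, location-side nodes). The side
    lists exclude the root and run from the entity up toward the root.
    """
    up_p = ancestors(tokens, protein)
    up_l = ancestors(tokens, location)
    on_location_chain = set(up_l)
    # The first ancestor of the protein that also dominates the location.
    root = next(i for i in up_p if i in on_location_chain)
    return root, up_p[:up_p.index(root)], up_l[:up_l.index(root)]


def describe(tokens, i):
    """Syntactic features of a node: its category and where its head lies."""
    word, category, head = tokens[i]
    if head is None:
        direction = "ROOT"
    else:
        direction = "head-right" if head > i else "head-left"
    return (word, category, direction)


root, protein_side, location_side = dependency_path(TOKENS, 0, 4)
print("root:", describe(TOKENS, root))
print("protein side:", [describe(TOKENS, i) for i in protein_side])
print("location side:", [describe(TOKENS, i) for i in location_side])
```

In this toy sentence the path root is the verb "localizes", the protein side holds only the protein token, and the location side runs from "nucleus" up through "to". A pattern-based extractor in the spirit of the abstract would then match the root and the two sides against learned patterns before accepting a (protein, localization) pair.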
Pages: 407-411
Page count: 5