MSLoc-DT: A new method for predicting the protein subcellular location of multispecies based on decision templates

Cited by: 22
Authors
Zhang, Shao-Wu [1]
Liu, Yan-Fang [1]
Yu, Yong [1]
Zhang, Ting-He [1]
Fan, Xiao-Nan [1]
Affiliations
[1] Northwestern Polytechnical University, College of Automation, Xi'an 710072, People's Republic of China
Funding
National Natural Science Foundation of China;
Keywords
Amino acid index distribution; Decision template; Gene ontology; Multispecies; Subcellular location; AMINO-ACID-COMPOSITION; SUPPORT VECTOR MACHINES; POSITIVE BACTERIAL PROTEINS; MULTIPLE CLASSIFIER FUSION; GENE ONTOLOGY; ENSEMBLE CLASSIFIER; LOCALIZATION PREDICTION; MEMBRANE-PROTEINS; HUM-MPLOC; EUK-MPLOC;
DOI
10.1016/j.ab.2013.12.013
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Revealing the subcellular location of newly discovered protein sequences can provide insight into their function and guide research at the cellular level. The rapidly increasing number of sequences entering the genome databanks has called for the development of automated analysis methods. Currently, most existing methods for predicting protein subcellular locations cover only one or a very limited number of species. Therefore, it is necessary to develop reliable and effective computational approaches that further improve the performance of protein subcellular location prediction and, at the same time, cover more species. The current study reports the development of a novel predictor called MSLoc-DT, which predicts the protein subcellular locations of human, animal, plant, bacteria, virus, fungi, and archaea. It introduces a novel feature extraction approach termed Amino Acid Index Distribution (AAID) and then fuses gene ontology information, sequential evolutionary information, and sequence statistical information through four different modes of pseudo amino acid composition (PseAAC) with a decision template rule. In the jackknife test, MSLoc-DT achieves overall accuracies of 86.5, 98.3, 90.3, 98.5, 95.9, 98.1, and 99.3% for human, animal, plant, bacteria, virus, fungi, and archaea, respectively, on seven stringent benchmark datasets. Compared with other predictors (e.g., Gpos-PLoc, Gneg-PLoc, Virus-PLoc, Plant-PLoc, Plant-mPLoc, ProLoc-Go, Hum-PLoc, GOASVM) on the gram-positive, gram-negative, virus, plant, eukaryotic, and human datasets, the new MSLoc-DT predictor is much more effective and robust. Although the MSLoc-DT predictor is designed to predict a single location per protein, the method can be extended to proteins with multiple locations by introducing multilabel machine learning approaches, such as the support vector machine and deep learning, as substitutes for the K-nearest neighbor (KNN) method.
As a user-friendly web server, MSLoc-DT is freely accessible at http://bioinfo.ibp.ac.cn/MSLOC_DT/index.html. Crown Copyright (C) 2013 Published by Elsevier Inc. All rights reserved.
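The decision template rule named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that each base classifier already outputs a vector of per-class support values, that a class template is the mean decision profile of the training proteins in that class, and that similarity is measured by squared Euclidean distance (one common choice; the abstract does not say which measure MSLoc-DT uses). All function names, class labels, and numbers below are hypothetical toy values.

```python
from typing import Dict, List, Sequence

# A decision profile has one row per base classifier and one column per class.
Profile = List[List[float]]


def mean_profile(profiles: Sequence[Profile]) -> Profile:
    """Element-wise mean of several decision profiles of the same shape."""
    n = len(profiles)
    rows, cols = len(profiles[0]), len(profiles[0][0])
    return [
        [sum(p[r][c] for p in profiles) / n for c in range(cols)]
        for r in range(rows)
    ]


def fit_templates(
    profiles: Sequence[Profile], labels: Sequence[str]
) -> Dict[str, Profile]:
    """Build one template per class: the mean profile of its training samples."""
    by_class: Dict[str, List[Profile]] = {}
    for profile, label in zip(profiles, labels):
        by_class.setdefault(label, []).append(profile)
    return {label: mean_profile(ps) for label, ps in by_class.items()}


def squared_distance(a: Profile, b: Profile) -> float:
    """Squared Euclidean distance between two profiles (assumed measure)."""
    return sum(
        (x - y) ** 2 for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b)
    )


def predict(templates: Dict[str, Profile], profile: Profile) -> str:
    """Return the class whose template is closest to the query profile."""
    return min(templates, key=lambda label: squared_distance(templates[label], profile))


# Toy data: two base classifiers, two classes (columns: nucleus, cytoplasm).
train_profiles = [
    [[0.9, 0.1], [0.8, 0.2]],
    [[0.7, 0.3], [0.6, 0.4]],
    [[0.2, 0.8], [0.3, 0.7]],
    [[0.1, 0.9], [0.4, 0.6]],
]
train_labels = ["nucleus", "nucleus", "cytoplasm", "cytoplasm"]

templates = fit_templates(train_profiles, train_labels)
query = [[0.3, 0.7], [0.4, 0.6]]
predicted = predict(templates, query)
print(predicted)
```

In a pipeline like the one the abstract describes, each row of a profile would come from a base classifier trained on one feature mode, so the fusion step compares the whole pattern of classifier outputs rather than taking a simple vote.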
Pages: 164-171
Number of pages: 8