iLoc-Plant: a multi-label classifier for predicting the subcellular localization of plant proteins with both single and multiple sites

Citations: 202
Authors
Wu, Zhi-Cheng [1, 2]
Xiao, Xuan [1, 2]
Chou, Kuo-Chen [2]
Affiliations
[1] Jing De Zhen Ceramic Institute, Computer Department, Jing De Zhen 333046, People's Republic of China
[2] Gordon Life Science Institute, San Diego, CA 92130 USA
Funding
National Natural Science Foundation of China
Keywords
AMINO-ACID-COMPOSITION; SUPPORT VECTOR MACHINES; GRAM-NEGATIVE BACTERIA; TERMINAL TARGETING SEQUENCES; IMPROVED HYBRID APPROACH; APOPTOSIS PROTEINS; LOCATION PREDICTION; GENE ONTOLOGY; SORTING SIGNALS; REPRESENTATION;
DOI
10.1039/c1mb05232b
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Predicting protein subcellular localization is a challenging problem, particularly when query proteins may simultaneously exist at, or move between, two or more subcellular location sites. Most existing methods can deal only with single-location proteins. Multiple-location proteins should not be ignored, however, because they usually bear special functions worthy of notice. By introducing the "multi-labeled learning" approach, a new predictor called iLoc-Plant has been developed that can deal with systems containing both single- and multiple-location plant proteins. As a demonstration, jackknife cross-validation was performed with iLoc-Plant on a benchmark dataset of plant proteins classified into the following 12 location sites: (1) cell membrane, (2) cell wall, (3) chloroplast, (4) cytoplasm, (5) endoplasmic reticulum, (6) extracellular, (7) Golgi apparatus, (8) mitochondrion, (9) nucleus, (10) peroxisome, (11) plastid, and (12) vacuole, where some proteins belong to two or three locations but none has >= 25% pairwise sequence identity to any other in the same subset. The overall success rate thus obtained by iLoc-Plant was 71%, which is remarkably higher than the rates achieved by any existing predictor that can also deal with such a stringent and complicated plant protein system. As a user-friendly web server, iLoc-Plant is freely accessible to the public at http://icpr.jci.edu.cn/bioinfo/iLoc-Plant or http://www.jci-bioinfo.cn/iLoc-Plant. Moreover, for the convenience of the vast majority of experimental scientists, a step-by-step guide is provided on how to use the web server to get the desired results without needing to follow the complicated mathematical equations, which are presented in this paper for completeness. It is anticipated that iLoc-Plant may become a useful bioinformatics tool for Molecular Cell Biology, Proteomics, Systems Biology, and Drug Development.
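The jackknife (leave-one-out) evaluation of a multi-label predictor mentioned in the abstract can be illustrated with a minimal sketch. This is not the iLoc-Plant algorithm: the nearest-neighbour rule, the two-dimensional toy feature vectors, the function names `predict_labels` and `jackknife`, and both success measures are assumptions chosen only to show the evaluation loop, in which each protein is held out in turn and predicted from all the others.

```python
from math import dist

# Hypothetical toy data: (feature vector, set of location labels).
# Real predictors use much richer protein descriptors.
SAMPLES = [
    ((0.0, 0.0), {"nucleus"}),
    ((0.1, 0.0), {"nucleus"}),
    ((1.0, 1.0), {"chloroplast", "mitochondrion"}),
    ((1.1, 1.0), {"chloroplast", "mitochondrion"}),
    ((0.5, 0.6), {"cytoplasm"}),
]


def predict_labels(query, train, k=1):
    """Return the union of the label sets of the k nearest training samples.

    Assumption: a plain Euclidean nearest-neighbour rule stands in for the
    actual multi-label classifier.
    """
    nearest = sorted(train, key=lambda item: dist(query, item[0]))[:k]
    labels = set()
    for _, sample_labels in nearest:
        labels |= sample_labels
    return labels


def jackknife(samples, k=1):
    """Leave each sample out once and predict it from the remaining ones.

    Returns two illustrative measures (assumed here, not taken from the paper):
    the fraction of proteins whose predicted label set matches exactly, and
    the fraction of true (protein, location) pairs that were recovered.
    """
    exact_matches = 0
    recovered_pairs = 0
    total_pairs = 0
    for i, (features, true_labels) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]  # hold out sample i
        predicted = predict_labels(features, train, k)
        exact_matches += predicted == true_labels
        recovered_pairs += len(predicted & true_labels)
        total_pairs += len(true_labels)
    return exact_matches / len(samples), recovered_pairs / total_pairs


if __name__ == "__main__":
    exact_rate, pair_rate = jackknife(SAMPLES)
    print(f"exact-match rate: {exact_rate:.2f}")
    print(f"recovered location pairs: {pair_rate:.2f}")
```

Because every protein is tested exactly once against a model that never saw it, the jackknife gives a single, reproducible result for a given dataset, unlike random subsampling tests.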
Pages: 3287-3297
Number of pages: 11