iLoc-Plant: a multi-label classifier for predicting the subcellular localization of plant proteins with both single and multiple sites

Cited by: 202
Authors
Wu, Zhi-Cheng [1 ,2 ]
Xiao, Xuan [1 ,2 ]
Chou, Kuo-Chen [2 ]
Affiliations
[1] Jing De Zhen Ceram Inst, Dept Comp, Jing De Zhen 333046, Peoples R China
[2] Gordon Life Sci Inst, San Diego, CA 92130 USA
Funding
National Natural Science Foundation of China;
Keywords
AMINO-ACID-COMPOSITION; SUPPORT VECTOR MACHINES; GRAM-NEGATIVE BACTERIA; TERMINAL TARGETING SEQUENCES; IMPROVED HYBRID APPROACH; APOPTOSIS PROTEINS; LOCATION PREDICTION; GENE ONTOLOGY; SORTING SIGNALS; REPRESENTATION;
DOI
10.1039/c1mb05232b
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Predicting protein subcellular localization is a challenging problem, particularly when query proteins may simultaneously exist at, or move between, two or more different subcellular location sites. Most existing methods can only deal with single-location proteins. Multiple-location proteins should not be ignored, however, because they usually bear special functions worthy of notice. By introducing the "multi-label learning" approach, a new predictor, called iLoc-Plant, has been developed that can deal with systems containing both single- and multiple-location plant proteins. As a demonstration, jackknife cross-validation was performed with iLoc-Plant on a benchmark dataset of plant proteins classified into the following 12 location sites: (1) cell membrane, (2) cell wall, (3) chloroplast, (4) cytoplasm, (5) endoplasmic reticulum, (6) extracellular, (7) Golgi apparatus, (8) mitochondrion, (9) nucleus, (10) peroxisome, (11) plastid, and (12) vacuole, where some proteins belong to two or three locations but none has >= 25% pairwise sequence identity to any other in the same subset. The overall success rate thus obtained by iLoc-Plant was 71%, which is remarkably higher than that achieved by any existing predictor that also has the capacity to deal with such a stringent and complicated plant protein system. As a user-friendly web-server, iLoc-Plant is freely accessible to the public at http://icpr.jci.edu.cn/bioinfo/iLoc-Plant or http://www.jci-bioinfo.cn/iLoc-Plant. Moreover, for the convenience of the vast majority of experimental scientists, a step-by-step guide is provided on how to use the web-server to get the desired results without the need to follow the complicated mathematical equations, which are presented in this paper for completeness. It is anticipated that iLoc-Plant may become a useful bioinformatics tool for Molecular Cell Biology, Proteomics, Systems Biology, and Drug Development.
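To make the evaluation protocol in the abstract concrete, the following is a minimal sketch of jackknife (leave-one-out) cross-validation for a multi-label predictor. It is not the iLoc-Plant algorithm: the amino acid composition features, the 1-nearest-neighbour rule, the exact-match score, and the toy sequences are all assumptions chosen for illustration, and the score computed here is not necessarily the "overall success rate" defined in the paper.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def composition(seq):
    """Return the 20-dimensional amino acid composition of a sequence."""
    counts = Counter(seq)
    total = sum(counts[a] for a in AMINO_ACIDS) or 1
    return [counts[a] / total for a in AMINO_ACIDS]

def distance(u, v):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def predict(query, training):
    """Copy the whole label set of the nearest training protein (1-NN).

    Assumption: a stand-in predictor, not the method used by iLoc-Plant.
    """
    nearest = min(training, key=lambda item: distance(query, item[0]))
    return nearest[1]

def jackknife_exact_match(dataset):
    """Leave each protein out in turn, predict it from the rest, and
    return the fraction whose predicted label set equals the true set."""
    features = [(composition(seq), frozenset(labels)) for seq, labels in dataset]
    hits = 0
    for i, (vec, true_labels) in enumerate(features):
        rest = features[:i] + features[i + 1:]  # every sample except the i-th
        if predict(vec, rest) == true_labels:
            hits += 1
    return hits / len(features)

# Made-up sequences and labels, for illustration only.
toy = [
    ("AAAAAAGG", {"chloroplast"}),
    ("AAAAAGGG", {"chloroplast"}),
    ("KKKKRRRR", {"nucleus", "cytoplasm"}),
    ("KKKRRRRR", {"nucleus", "cytoplasm"}),
]
print(jackknife_exact_match(toy))
```

Because each protein carries a set of locations rather than a single class, the comparison is made between label sets; a stricter or more lenient score (for example, counting partially correct sets) would change the reported rate.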
Pages: 3287-3297
Page count: 11