A Novel Numerical Feature Extraction Method for Protein Subcellular Localization

Cited by: 2
Authors
Chen, Haowen [1]
Liao, Bo [1]
Cai, Lijun [1]
Chen, Xia [2]
Liu, Shixiong [1]
Affiliations
[1] Hunan Univ, Coll Informat Sci & Engn, Changsha 410082, Hunan, Peoples R China
[2] Changsha Aeronaut Vocat & Tech Coll, Changsha 410124, Hunan, Peoples R China
Keywords
Protein Subcellular Location; Numerical Feature Extraction; Classification; Support Vector Machine; AMINO-ACID-COMPOSITION; FUNCTIONAL CLASS; GENERAL-FORM; PREDICTION; LOCATION; REPRESENTATION; SEQUENCE
DOI
10.1166/jctn.2013.3259
Chinese Library Classification (CLC) number
O6 [Chemistry]
Discipline classification code
0703
Abstract
Although many methods for predicting protein subcellular localization exist, their prediction accuracy remains unsatisfactory. A likely main reason is that these methods lose important information when converting sequence information into numerical form. To capture more global and local information from protein sequences, we propose a novel numerical feature extraction method for representing protein sequences that combines three groups of features: amino acid composition, compression tripeptide composition, and local frequency domain values. We then use a support vector machine and the nearest neighbor algorithm to predict subcellular localization on two benchmark data sets, and compare the prediction results and evaluation index values with those of other methods. The comparison shows that our method effectively extracts information from protein sequences and improves the prediction accuracy of subcellular localization.
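As an illustration of the first feature group named in the abstract, the following is a minimal Python sketch of amino acid composition, assuming the standard definition (the relative frequency of each of the 20 standard residues in a sequence). The function name `amino_acid_composition`, the residue ordering, and the choice to skip non-standard symbols are assumptions made for this sketch and are not taken from the paper. The abstract does not specify how the compression tripeptide composition or the local frequency domain values are computed, so they are not sketched here.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code (ordering is an assumption).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return a 20-dimensional vector of relative residue frequencies.

    Hypothetical helper for illustration only: symbols outside the 20
    standard residues (e.g. 'X') are ignored rather than counted.
    """
    seq = sequence.upper()
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    # One frequency per residue, in the fixed order of AMINO_ACIDS.
    return [counts[aa] / total for aa in AMINO_ACIDS]


if __name__ == "__main__":
    features = amino_acid_composition("MKVLAAGIVGLLLA")
    print(len(features), round(sum(features), 6))
```

A vector of this kind could then be concatenated with the other two feature groups and passed to a classifier such as a support vector machine, as the abstract describes.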
Pages: 2618-2625
Number of pages: 8