Feature Selection and Granular SVM Classification for Protein Arginine Methylation Identification

Cited by: 2
Authors
Ding, Zejin [1]
Zhang, Yan-Qing [1]
Zheng, Yujun George [2]
Affiliations
[1] Georgia State Univ, Dept Comp Sci, Atlanta, GA 30303 USA
[2] Georgia State Univ, Dept Chem, Atlanta, GA 30303 USA
Keywords
Protein Methylation; Imbalanced Data Mining; Granular Support Vector Machines (GSVM); Methylation Prediction; Feature Selection; PREDICTION
DOI
10.1109/ICSMC.2009.5345973
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Protein methylation was discovered half a century ago but has been studied far less than other modifications. Computational analysis has recently been introduced to discover unknown methylation sites from the few known ones. Predicting possible methylation effectively requires a carefully devised classification strategy. In this paper, we first extract informative features from methylated fragments in many protein sequences, including the physicochemical properties, secondary structure information, evolutionary profiles, and solvent accessibility of surrounding residues. Then an efficient feature selection method, mRMR (minimum redundancy maximum relevance), is applied to eliminate redundant features while keeping important ones. Since methylated residues are far fewer than non-methylated ones, the collected data are imbalanced. We therefore propose using the granular support vector machine (GSVM), which is specially designed for imbalanced classification problems. A 7-fold cross-validation shows that our strategy achieves prediction accuracy comparable to, or even better than, many current methods. Our method also provides insights into the underlying mechanisms of protein methylation.
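The feature selection step named in the abstract can be illustrated with a minimal sketch of greedy mRMR in its common "difference" form: each step picks the feature whose mutual information with the class label, minus its mean mutual information with the features already chosen, is largest. This is not the authors' implementation. The function names, the assumption that features are already discretized, and the toy data are all hypothetical, and the GSVM classifier itself is not shown.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Mutual information (in nats) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def mrmr_select(columns, labels, k):
    """Greedy mRMR (difference form) over discrete feature columns.

    columns: list of feature columns, each a sequence of discrete values.
    labels:  class labels (e.g. 1 = methylated, 0 = non-methylated).
    Returns the indices of up to k selected features, in selection order.
    """
    relevance = [mutual_information(col, labels) for col in columns]
    selected = []
    remaining = list(range(len(columns)))

    def score(j):
        # Relevance to the label minus mean redundancy with chosen features.
        if not selected:
            return relevance[j]
        redundancy = sum(
            mutual_information(columns[j], columns[s]) for s in selected
        ) / len(selected)
        return relevance[j] - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: feature 1 duplicates feature 0, feature 2 is
# weaker on its own but carries different information.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
columns = [
    [1, 1, 1, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1, 0, 0, 0],
]
chosen = mrmr_select(columns, labels, 2)
print(chosen)
```

On this toy input the duplicate column is skipped in favor of the less relevant but non-redundant one, which is the behavior the abstract describes as eliminating redundant features while keeping important ones.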
Pages: 2979 / +
Page count: 3