Predicting protein oxidation sites with feature selection and analysis approach

Cited by: 30
Authors
Niu, Shen [2, 3]
Hu, Le-Le [1]
Zheng, Lu-Lu [3, 6]
Huang, Tao [2, 3]
Feng, Kai-Yan [3]
Cai, Yu-Dong [1, 4, 7]
Li, Hai-Peng [5]
Li, Yi-Xue [1, 2]
Chou, Kuo-Chen [7]
Affiliations
[1] Shanghai Univ, Inst Syst Biol, Shanghai 200444, Peoples R China
[2] Chinese Acad Sci, Shanghai Inst Biol Sci, Key Lab Syst Biol, Shanghai 200031, Peoples R China
[3] Shanghai Ctr Bioinformat Technol, Shanghai 200235, Peoples R China
[4] Fudan Univ, Ctr Computat Syst Biol, Shanghai 200433, Peoples R China
[5] Chinese Acad Sci, Shanghai Inst Biol Sci, CAS MPG Partner Inst Computat Biol, Shanghai 200031, Peoples R China
[6] Huazhong Univ Sci & Technol, Hubei Bioinformat & Mol Imaging Key Lab, Wuhan 430074, Peoples R China
[7] Gordon Life Sci Inst, San Diego, CA 92130 USA
Source
Funding
National Natural Science Foundation of China;
Keywords
AMINO-ACID-COMPOSITION; END-PRODUCTS; SUBCELLULAR-LOCALIZATION; CARBONYL GROUPS; BIOMARKERS; STRESS; CLASSIFIER; MECHANISM; PATHOLOGY; RESIDUES;
DOI
10.1080/07391102.2011.672629
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Subject classification codes
071010 ; 081704 ;
Abstract
Protein oxidation is a ubiquitous post-translational modification that plays important roles in various physiological and pathological processes. Because protein oxidation can also arise as an experimental artifact or be caused by atmospheric oxygen during sample collection and analysis, and because determining oxidation sites purely by biochemical experiments is both time-consuming and expensive, in silico methods for identifying protein oxidation sites rapidly and effectively would be of great benefit. In this study, we developed a computational method to address this problem. The method is based on the nearest neighbor algorithm, combined with the maximum relevance minimum redundancy and incremental feature selection approaches. From the initial 735 features, 16 were selected as the optimal feature set. Of these 16 features, 10 were associated with position-specific scoring matrix conservation scores, three with amino acid factors, one with the propensity of conservation of residues on the protein surface, one with the side chain count of carbon atom deviation from mean, and one with solvent accessibility. The prediction model achieved an overall success rate of 75.82%, which is encouraging for practical applications. The 16 optimal features may also provide useful clues and insights for an in-depth understanding of the mechanism of protein oxidation.
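The pipeline named in the abstract (rank features by maximum relevance minimum redundancy, then grow the feature set incrementally and keep the size at which a nearest neighbor classifier performs best) can be illustrated with a minimal sketch. This is not the authors' implementation. Several choices are assumptions made for illustration only: absolute Pearson correlation stands in for the mutual information normally used by maximum relevance minimum redundancy, accuracy is estimated by leave-one-out testing, and the toy data set and all function names are hypothetical.

```python
import math


def abs_corr(a, b):
    """Absolute Pearson correlation (assumed stand-in for mutual information)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0
    return abs(cov / math.sqrt(va * vb))


def mrmr_rank(X, y):
    """Greedy ordering: relevance to y minus mean redundancy with chosen features."""
    n_feat = len(X[0])
    cols = [[row[j] for row in X] for j in range(n_feat)]
    relevance = [abs_corr(c, y) for c in cols]
    ranked, remaining = [], list(range(n_feat))
    while remaining:
        def score(j):
            if not ranked:
                return relevance[j]
            redundancy = sum(abs_corr(cols[j], cols[k]) for k in ranked) / len(ranked)
            return relevance[j] - redundancy
        best = max(remaining, key=score)
        ranked.append(best)
        remaining.remove(best)
    return ranked


def loo_1nn_accuracy(X, y, feats):
    """Leave-one-out accuracy of a 1-nearest-neighbor classifier on `feats`."""
    correct = 0
    for i in range(len(X)):
        best_d, best_label = float("inf"), None
        for k in range(len(X)):
            if k == i:
                continue  # never use the held-out sample as its own neighbor
            d = sum((X[i][j] - X[k][j]) ** 2 for j in feats)
            if d < best_d:
                best_d, best_label = d, y[k]
        correct += best_label == y[i]
    return correct / len(X)


def incremental_feature_selection(X, y):
    """Evaluate the top-1, top-2, ... ranked features; keep the smallest best set."""
    ranked = mrmr_rank(X, y)
    curve = [loo_1nn_accuracy(X, y, ranked[:m]) for m in range(1, len(ranked) + 1)]
    best_m = max(range(len(curve)), key=curve.__getitem__) + 1
    return ranked, ranked[:best_m], curve


# Hypothetical toy data: feature 0 is informative, feature 1 is a near copy
# of feature 0 (redundant), and feature 2 is unrelated to the label.
y = [0] * 10 + [1] * 10
X = []
for i, label in enumerate(y):
    f0 = 2 * label + 0.1 * (i % 5)
    f1 = f0 + 0.01 * (i % 3)
    f2 = ((i * 7) % 11) / 10
    X.append([f0, f1, f2])

ranked, selected, curve = incremental_feature_selection(X, y)
print("ranking:", ranked, "selected:", selected, "accuracy curve:", curve)
```

On this toy data the first-ranked feature alone already separates the two classes, so the selected set has a single element. In the study itself the same kind of accuracy-versus-size curve would be what identifies 16 of the 735 candidate features as optimal.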
Pages: 650-658
Page count: 9