Sequence-Based Prediction of DNA-Binding Residues in Proteins with Conservation and Correlation Information

Cited by: 52
Authors
Ma, Xin [1, 2]
Guo, Jing [1]
Liu, Hong-De [1]
Xie, Jian-Ming [1]
Sun, Xiao [1]
Affiliations
[1] Southeast Univ, State Key Lab Bioelect, Sch Biol Sci & Med Engn, Nanjing, Jiangsu, Peoples R China
[2] Nanjing Audit Univ, Nanjing, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
DNA-binding residues; random forest; physicochemical property; evolutionary information; WEB SERVER; SITES; IDENTIFICATION; EVOLUTIONARY; PARAMETERS; DISCOVERY; TOOL;
DOI
10.1109/TCBB.2012.106
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Recognizing DNA-binding residues in proteins is critical for understanding the mechanisms of DNA-protein interactions and gene expression, and for guiding drug design. This study therefore proposes DNABR (DNA Binding Residues), a method that predicts DNA-binding residues in protein sequences using a random forest (RF) classifier with sequence-based features. Two novel types of sequence feature are introduced. The first combines evolutionary information with the conservation of the physicochemical properties of the amino acids. The second captures the correlation between amino acids at different sequence positions, that is, their dependency effect with regard to polarity-charge and hydrophobic properties. These two features, together with an orthogonal binary vector that encodes the identity of each of the 20 amino acid types, are used to build the DNABR model. DNABR achieves a Matthews correlation coefficient (MCC) of 0.6586 and an overall accuracy (ACC) of 93.04 percent, with a sensitivity (SE) of 68.47 percent and a specificity (SP) of 98.16 percent. Comparisons across the individual features show that the two novel features contribute most to the improvement in predictive ability. Furthermore, performance comparisons with other approaches show that DNABR performs well at detecting binding residues in putative DNA-binding proteins. The DNABR web server is freely available at http://www.cbi.seu.edu.cn/DNABR/.
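As an illustration of two ingredients named in the abstract, the sketch below shows one common way to build an orthogonal binary (one-hot) vector for a residue and to compute SE, SP, ACC and MCC from confusion-matrix counts. It is a minimal sketch under stated assumptions, not the authors' implementation: the residue ordering in `AMINO_ACIDS`, the all-zero encoding for unknown residues, and the function names `one_hot` and `binary_metrics` are all hypothetical choices made here for illustration.

```python
import math

# Assumed ordering: the 20 standard residues by one-letter code, alphabetically.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot(residue):
    """Return a 20-element orthogonal binary vector for one residue."""
    vec = [0] * len(AMINO_ACIDS)
    idx = AMINO_ACIDS.find(residue)
    if idx >= 0:  # assumption: unknown residues (e.g. 'X') stay all-zero
        vec[idx] = 1
    return vec


def binary_metrics(tp, tn, fp, fn):
    """Return SE, SP, ACC and MCC from confusion-matrix counts."""
    se = tp / (tp + fn) if (tp + fn) else 0.0   # sensitivity (recall on binding residues)
    sp = tn / (tn + fp) if (tn + fp) else 0.0   # specificity (recall on non-binding residues)
    acc = (tp + tn) / (tp + tn + fp + fn)       # overall accuracy
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"SE": se, "SP": sp, "ACC": acc, "MCC": mcc}


if __name__ == "__main__":
    # Made-up counts for illustration only; these are not the paper's data.
    print(one_hot("K"))
    print(binary_metrics(tp=40, tn=45, fp=5, fn=10))
```

Because binding residues are usually far rarer than non-binding ones, accuracy alone can look high even when sensitivity is modest, which is why MCC is typically reported alongside it.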
Pages: 1766-1775
Number of pages: 10