Structure-based prediction of DNA-binding proteins by structural alignment and a volume-fraction corrected DFIRE-based energy function

Cited by: 73
Authors
Zhao, Huiying [2 ]
Yang, Yuedong [2 ]
Zhou, Yaoqi [1 ]
Affiliations
[1] Indiana Univ, Sch Med, Ctr Computat Biol & Bioinformat, Indianapolis, IN 46202 USA
[2] Indiana Univ Purdue Univ, Sch Informat, Indianapolis, IN 46202 USA
Funding
US National Institutes of Health
Keywords
SECONDARY STRUCTURES; RNA-BINDING; SEQUENCE; SITES; COMPLEXES; SUPPORT; MODELS; MOTIFS; FORCE;
DOI
10.1093/bioinformatics/btq295
Chinese Library Classification
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Motivation: Template-based prediction of DNA-binding proteins requires not only structural similarity between the target and template structures but also a prediction of the binding affinity between the target and DNA to ensure binding. Here, we propose to predict protein-DNA binding affinity by introducing a new volume-fraction correction to a statistical energy function based on a distance-scaled, finite, ideal-gas reference (DFIRE) state.

Results: We showed that this energy function, together with the structural alignment program TM-align, achieves a Matthews correlation coefficient (MCC) of 0.76 with an accuracy of 98%, a precision of 93% and a sensitivity of 64% for predicting DNA-binding proteins in a benchmark of 179 DNA-binding proteins and 3797 non-binding proteins. The MCC value is substantially higher than the best MCC value of 0.69 given by previous methods. Application of this method to 2235 structural genomics targets identified 37 as DNA-binding proteins: 27 (73%) are putatively DNA binding, only 1 has annotated functions that do not include DNA binding, and the remaining proteins have unknown function. The method provides a highly accurate and sensitive technique for structure-based prediction of DNA-binding proteins.
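The benchmark figures in the abstract can be cross-checked against each other with a short calculation. The sketch below is illustrative only: the abstract reports percentages, not a confusion matrix, so the counts of true and false positives are back-calculated here from the rounded sensitivity and precision, and they are an assumption rather than values taken from the paper. The function name `binary_metrics` is likewise hypothetical.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return accuracy, precision, sensitivity and MCC for a 2x2 confusion matrix."""
    total = tp + fp + tn + fn
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),
        # MCC is defined as 0 when any marginal sum is 0.
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


# Benchmark sizes stated in the abstract.
n_binding, n_nonbinding = 179, 3797

# ASSUMPTION: counts inferred from the rounded percentages in the abstract
# (sensitivity 64%, precision 93%); the paper's exact counts may differ.
tp = round(0.64 * n_binding)      # true positives implied by the sensitivity
fn = n_binding - tp               # binding proteins that were missed
fp = round(tp / 0.93 - tp)        # false positives implied by the precision
tn = n_nonbinding - fp            # non-binding proteins correctly rejected

metrics = binary_metrics(tp, fp, tn, fn)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Under these inferred counts, the computed accuracy, precision, sensitivity and MCC round to the values quoted in the abstract, which suggests the reported figures are mutually consistent. The gap between 98% accuracy and 64% sensitivity reflects the class imbalance of the benchmark, which is why MCC is the headline metric.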
Pages: 1857-1863
Page count: 7