DNABind: A hybrid algorithm for structure-based prediction of DNA-binding residues by combining machine learning- and template-based approaches

Cited by: 52

Authors:

Liu, Rong [1,2]

Hu, Jianjun [1]

Affiliations:

[1] Univ S Carolina, Dept Comp Sci & Engn, Columbia, SC 29208 USA

[2] Huazhong Agr Univ, Coll Life Sci & Technol, Ctr Bioinformat, Wuhan 430070, Peoples R China

Source:

PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS | 2013 / Vol. 81 / Issue 11

Funding:

National Science Foundation (USA);

Keywords:

protein-DNA interaction; DNA-binding residue; machine learning; template; structural analysis; conformational change; PROTEIN-STRUCTURE ALIGNMENT; AMINO-ACID-SEQUENCES; SECONDARY STRUCTURE; WEB SERVER; SITES; CONSERVATION; INFORMATION; EVOLUTIONARY; RECOGNITION; POTENTIALS;

DOI:

10.1002/prot.24330

Chinese Library Classification (CLC):

Q5 [Biochemistry]; Q7 [Molecular Biology];

Discipline classification codes:

071010 ; 081704 ;

Abstract:

Accurate prediction of DNA-binding residues has become a problem of increasing importance in structural bioinformatics. Here, we present DNABind, a novel hybrid algorithm for identifying these crucial residues by exploiting the complementarity between machine learning- and template-based methods. Our machine learning-based method was based on the probabilistic combination of a structure-based and a sequence-based predictor, both of which were implemented using support vector machines. The former included our structural features, such as solvent accessibility, local geometry, topological features, and relative positions, which can effectively quantify the difference between DNA-binding and nonbinding residues. The latter combined evolutionary conservation features with three other sequence attributes. Our template-based method depended on structural alignment and used template structures from known protein-DNA complexes to infer DNA-binding residues. We showed that the template method performed very well when reliable templates were found for the query proteins, but that it tended to be strongly influenced by template quality and by conformational changes upon DNA binding. In contrast, the machine learning approach performed better when high-quality templates were not available (about one third of the cases in our dataset) or when the query protein underwent substantial conformational changes upon DNA binding. Our extensive experiments indicated that the hybrid approach can clearly improve on the performance of the individual methods for both bound and unbound structures. DNABind also significantly outperformed state-of-the-art algorithms by around 10% in terms of the Matthews correlation coefficient. The proposed methodology could also be widely applied to the annotation of other protein functional sites. DNABind is freely available at http://mleg.cse.sc.edu/DNABind/. Proteins 2013; 81:1885-1899. © 2013 Wiley Periodicals, Inc.
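
Note on the evaluation metric: the abstract reports gains in terms of the Matthews correlation coefficient (MCC), which for a binary residue labeling is (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). The following is a minimal Python sketch of that standard formula only. It is not taken from DNABind, and the function name and toy labels are illustrative assumptions.

```python
import math


def matthews_corrcoef(y_true, y_pred):
    """Standard MCC for binary labels (1 = DNA-binding residue, 0 = nonbinding).

    Illustrative helper, not part of DNABind. Returns 0.0 when the
    denominator is zero, a common convention for the degenerate case.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy example (made-up labels for eight residues, not data from the paper).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
mcc = matthews_corrcoef(y_true, y_pred)
```

MCC ranges from -1 to 1 and stays informative when binding residues are a small minority of all residues, which is typically the case for this kind of prediction task.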

Citation

Pages: 1885-1899

Page count: 15
