Predicting and annotating catalytic residues: An information theoretic approach

Cited by: 20
Authors
Sterner, Beckett
Singh, Rohit
Berger, Bonnie [1]
Affiliations
[1] MIT, Dept Math, Cambridge, MA 02139 USA
[2] MIT, Comp Sci & Artificial Intelligence Lab, Cambridge, MA USA
Keywords
algorithms; computational molecular biology; protein folding; multiple sequence alignment; information theory; FUNCTIONAL SITES; PROTEIN FUNCTION; ACTIVE-SITES; SIMILARITY ANALYSIS; PATTERN SIMILARITY; CRYSTAL-STRUCTURE; SEQUENCE; IDENTIFICATION; TOOL; ENZYMES;
DOI
10.1089/cmb.2007.0042
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
We introduce a computational method to predict and annotate the catalytic residues of a protein using only its sequence information, so that we describe both the residues' sequence locations (prediction) and their specific biochemical roles in the catalyzed reaction (annotation). While knowing the chemistry of an enzyme's catalytic residues is essential to understanding its function, the challenges of prediction and annotation have remained difficult, especially when only the enzyme's sequence and no homologous structures are available. Our sequence-based approach follows the guiding principle that catalytic residues performing the same biochemical function should have similar chemical environments; it detects specific conservation patterns near in sequence to known catalytic residues and accordingly constrains what combination of amino acids can be present near a predicted catalytic residue. We associate with each catalytic residue a short sequence profile and define a Kullback-Leibler (KL) distance measure between these profiles, which, as we show, effectively captures even subtle biochemical variations. We apply the method to the class of glycohydrolase enzymes. This class includes proteins from 96 families with very different sequences and folds, many of which perform important functions. In a cross-validation test, our approach correctly predicts the location of the enzymes' catalytic residues with a sensitivity of 80% at a specificity of 99.4%, and in a separate cross-validation we also correctly annotate the biochemical role of 80% of the catalytic residues. Our results compare favorably to existing methods. Moreover, our method is more broadly applicable because it relies on sequence and not structure information; it may, furthermore, be used in conjunction with structure-based methods.
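The abstract describes attaching a short sequence profile to each catalytic residue and comparing profiles with a Kullback-Leibler (KL) distance, but it does not give the exact definition. The following is only a minimal sketch of one plausible reading: a profile is taken to be a list of per-position amino-acid distributions built from equal-length sequence windows, and the distance is the symmetrized KL divergence summed over positions. The function names, the pseudocount smoothing, the symmetrization, and the example windows are all assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_distribution(column, pseudocount=1.0):
    """Smoothed amino-acid frequencies for one window position.

    The pseudocount (an assumption, not from the paper) keeps every
    probability positive so the KL divergence stays finite.
    """
    counts = Counter(c for c in column if c in AMINO_ACIDS)
    total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
    return {a: (counts[a] + pseudocount) / total for a in AMINO_ACIDS}


def build_profile(windows, pseudocount=1.0):
    """Build one distribution per position from equal-length windows."""
    if len({len(w) for w in windows}) != 1:
        raise ValueError("all windows must have the same length")
    return [column_distribution(col, pseudocount) for col in zip(*windows)]


def kl_divergence(p, q):
    """KL(p || q) in nats for two distributions over AMINO_ACIDS."""
    return sum(p[a] * math.log(p[a] / q[a]) for a in AMINO_ACIDS)


def profile_distance(profile_a, profile_b):
    """Symmetrized KL divergence summed over profile positions (assumed form)."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must have the same length")
    return sum(
        kl_divergence(p, q) + kl_divergence(q, p)
        for p, q in zip(profile_a, profile_b)
    )


# Hypothetical windows centred on catalytic residues (made-up sequences).
group_a = build_profile(["GDSEW", "ADSEW", "GDTEW"])
group_b = build_profile(["LKHAV", "LRHAV", "MKHGV"])

print(profile_distance(group_a, group_a))      # identical profiles
print(profile_distance(group_a, group_b) > 0)  # dissimilar profiles
```

Under these assumptions, a small distance means the two residues sit in similar local sequence environments, which is the kind of signal the abstract says is used to constrain predictions and assign biochemical roles.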
Pages: 1058-1073
Page count: 16