Sequence-Based Prediction of Metamorphic Behavior in Proteins

Cited by: 14
Authors
Chen, Nanhao [1]
Das, Madhurima [2]
LiWang, Andy [2, 3, 4, 5, 6, 7]
Wang, Lee-Ping [1]
Affiliations
[1] Univ Calif Davis, Dept Chem, Davis, CA 95616 USA
[2] Univ Calif Merced, Sch Nat Sci, Merced, CA USA
[3] Univ Calif Merced, Chem & Chem Biol, Merced, CA USA
[4] Univ Calif Merced, Ctr Cellular & Biomol Machines, Merced, CA USA
[5] Univ Calif San Diego, Ctr Circadian Biol, La Jolla, CA 92093 USA
[6] Univ Calif Merced, Quantitat & Syst Biol, Merced, CA USA
[7] Univ Calif Merced, Hlth Sci Res Inst, Merced, CA USA
Keywords
SECONDARY STRUCTURE; STRUCTURAL BASIS; DOMAIN; RECOGNITION; RESOLUTION; KINETICS; ANGLES; DIMER; CRKL
DOI
10.1016/j.bpj.2020.07.034
Chinese Library Classification number
Q6 [Biophysics]
Discipline classification code
071011
Abstract
An increasing number of proteins have been demonstrated in recent years to adopt multiple three-dimensional folds with different functions. These metamorphic proteins are characterized by having two or more folds with significant differences in their secondary structure, in which each fold is stabilized by a distinct local environment. So far, ~90 metamorphic proteins have been identified in the Protein Data Bank, but we and others hypothesize that a far greater number of metamorphic proteins remain undiscovered. In this work, we introduce a computational model to predict metamorphic behavior in proteins using only knowledge of the sequence. In this model, secondary structure prediction programs are used to calculate diversity indices, which are measures of uncertainty in predicted secondary structure at each position in the sequence; these are then used to assign protein sequences as likely to be metamorphic versus monomorphic (i.e., having just one fold). We constructed a reference data set to train our classification method, which includes a novel compilation of 136 likely monomorphic proteins and a set of 201 metamorphic protein structures taken from the literature. Our model is able to classify proteins as metamorphic versus monomorphic with a Matthews correlation coefficient of ~0.36 and true positive/true negative rates of ~65%/80%, suggesting that it is possible to predict metamorphic behavior in proteins using only sequence information.
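The abstract does not give the formula for the diversity index, so the following is only an illustrative sketch. It assumes the per-position uncertainty can be approximated by the Shannon entropy of the secondary-structure labels (H, E, C) that several predictors assign to the same residue. The function names `diversity_index`, `per_residue_diversity`, and `mcc` are hypothetical and not taken from the paper. The Matthews correlation coefficient is computed from its standard confusion-matrix definition.

```python
import math
from collections import Counter


def diversity_index(labels):
    """Shannon entropy (bits) of the labels predicted at one position.

    ASSUMPTION: entropy stands in for the paper's diversity index,
    whose exact definition is not given in the abstract.
    """
    counts = Counter(labels)
    total = len(labels)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def per_residue_diversity(predictions):
    """Diversity index at every position.

    `predictions` is a list of equal-length strings, one per
    (hypothetical) secondary structure predictor, over the alphabet H/E/C.
    """
    return [diversity_index(column) for column in zip(*predictions)]


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    # Three made-up predictor outputs for a four-residue toy sequence.
    toy = ["HHHC", "HHEC", "HECC"]
    print(per_residue_diversity(toy))
    # Made-up confusion-matrix counts, not the paper's results.
    print(mcc(tp=13, tn=16, fp=4, fn=7))
```

Positions where the predictors agree score 0, and positions where they split evenly across all three labels score log2(3). A classifier in this spirit would summarize these per-residue values over the sequence and threshold the summary, but the abstract does not say how the published model does this.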
Pages: 1380-1390
Number of pages: 11