Streamlining NMR Chemical Shift Predictions for Intrinsically Disordered Proteins: Design of Ensembles with Dimensionality Reduction and Clustering

Authors
Bakker, Michael J. [1 ]
Gaffour, Amina [1 ]
Juhas, Martin [1 ,2 ]
Zapletal, Vojtech [1 ]
Stosek, Jakub [1 ,3 ]
Bratholm, Lars A. [4 ]
Precechtelova, Jana Pavlikova [1 ]
Affiliations
[1] Charles Univ Prague, Fac Pharm Hradec Kralove, Hradec Kralove 50005, Czech Republic
[2] Univ Hradec Kralove, Fac Sci, Dept Chem, Hradec Kralove 50003, Czech Republic
[3] Masaryk Univ, Fac Sci, Dept Chem, Kotlarska 2, Brno 61137, Czech Republic
[4] Univ Bristol, Sch Chem, Bristol BS8 1TS, England
Keywords
MOLECULAR-DYNAMICS SIMULATIONS; GAUSSIAN-TYPE BASIS; ORBITAL METHODS; FORCE-FIELD; TYROSINE-HYDROXYLASE; FUNCTIONAL THEORY; BASIS-SETS; BINDING; PHOSPHORYLATION; ACCURACY;
DOI
10.1021/acs.jcim.4c00809
Chinese Library Classification number
R914 [Medicinal Chemistry];
Discipline classification code
100701;
Abstract
By combining dimensionality reduction (DR) with clustering algorithms (CA), this study improves the sampling procedure used to predict NMR chemical shifts (CS) of intrinsically disordered proteins (IDPs). We enhance NMR CS sampling by generating clustered ensembles that reflect the different properties and phenomena captured in the IDP trajectories. We critically assessed several rapid CS predictors, both neural network (e.g., Sparta+ and ShiftX2) and database-driven (ProCS-15), and the results highlight the need for more advanced quantum calculations and, consequently, for conformational ensembles of more tractable size. Although the neural network CS predictors outperformed ProCS-15 for all atoms, all tools showed poor agreement with H-N CSs, and the neural network CS predictors could not capture the influence of phosphorylated residues, which is highly relevant for IDPs. We also addressed the limitations of direct clustering on collective variables, as in the widely used GROMOS algorithm: clustered ensembles (CEs) produced by this algorithm agreed less well with chemical shifts than sequential ensembles (SEs) of similar size. Instead, we implement a multiscale DR and CA approach and explore the challenges and limitations of applying these algorithms to obtain more robust and tractable CEs. The novel feature of this investigation is the use of solvent-accessible surface area (SASA) as one of the fingerprints for DR, alongside the previously investigated alpha-carbon distances/angles and phi/psi dihedral angles. SASA-based tSNE DR produced CEs better aligned with the experimental CS (between 0.17 and 0.36 in r², 0.18-0.26 ppm, depending on the system and replicate). Furthermore, this technique produced CEs with better agreement than traditional SEs in 85.7% of all ensemble sizes.
This study investigates the quality of ensembles produced from different input features, compares latent spaces produced by linear and nonlinear DR techniques, and examines a novel integrated silhouette score scanning protocol for tSNE DR.
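The idea of scanning a silhouette score to build a clustered ensemble can be illustrated with a minimal sketch. It is not the authors' protocol: it assumes the DR step has already produced 2-D latent coordinates (the toy `latent` list below is hypothetical), uses a simple k-medoids routine as a stand-in clustering algorithm, and scans only the number of clusters. The helper names `farthest_first`, `k_medoids`, and `silhouette` are illustrative.

```python
import math

def farthest_first(points, k):
    """Deterministic seeding: start at frame 0, then repeatedly add the
    frame that is farthest from every medoid chosen so far."""
    medoids = [0]
    while len(medoids) < k:
        medoids.append(max(
            range(len(points)),
            key=lambda i: min(math.dist(points[i], points[m]) for m in medoids)))
    return medoids

def assign(points, medoids):
    """Label each frame with the index of its nearest medoid."""
    return [min(range(len(medoids)),
                key=lambda c: math.dist(p, points[medoids[c]]))
            for p in points]

def k_medoids(points, k, max_iter=100):
    """Alternate assignment and medoid update until the medoids are stable."""
    medoids = farthest_first(points, k)
    for _ in range(max_iter):
        labels = assign(points, medoids)
        updated = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:          # defensive: keep the old medoid
                updated.append(medoids[c])
                continue
            # New medoid: the member with the smallest summed distance to its cluster.
            updated.append(min(
                members,
                key=lambda i: sum(math.dist(points[i], points[j]) for j in members)))
        if updated == medoids:
            break
        medoids = updated
    return medoids, assign(points, medoids)

def silhouette(points, labels):
    """Mean silhouette coefficient; frames in singleton clusters score 0."""
    scores = []
    for i, p in enumerate(points):
        by_cluster = {}
        for j, q in enumerate(points):
            if j != i:
                by_cluster.setdefault(labels[j], []).append(math.dist(p, q))
        own = by_cluster.get(labels[i])
        if not own:
            scores.append(0.0)
            continue
        a = sum(own) / len(own)                      # mean intra-cluster distance
        b = min(sum(d) / len(d)                      # nearest other cluster
                for c, d in by_cluster.items() if c != labels[i])
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

# Hypothetical latent space: 27 "frames" forming three compact conformational states.
centers = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
latent = [(cx + 0.3 * dx, cy + 0.3 * dy)
          for cx, cy in centers for dx in (-1, 0, 1) for dy in (-1, 0, 1)]

# Scan the cluster count and keep the clustering with the highest silhouette score.
results = {}
for k in range(2, 6):
    medoids, labels = k_medoids(latent, k)
    results[k] = (silhouette(latent, labels), medoids)

best_k = max(results, key=lambda k: results[k][0])
ensemble = sorted(results[best_k][1])   # medoid frame indices form the clustered ensemble
print(best_k, ensemble)
```

The medoid frames returned in `ensemble` play the role of a clustered ensemble: one representative structure per cluster, on which a more expensive CS calculation could then be run. In practice the latent coordinates would come from a DR method such as tSNE applied to trajectory features, and the scan could cover DR hyperparameters as well as the cluster count.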
Pages: 6542-6556
Page count: 15