Predicting Structural Motifs of Glycosaminoglycans using Cryogenic Infrared Spectroscopy and Random Forest

Cited by: 7
Authors
Riedel, Jerome [1 ,2 ]
Meijer, Gerard [2 ]
von Helden, Gert [2 ]
Lettow, Maike [1 ,2 ]
Gotze, Michael [1 ,2 ]
Miller, Rebecca L. [4 ]
Boons, Geert-Jan [5 ,6 ]
Szekeres, Gergo Peter [1 ,2 ]
Pagel, Kevin [1 ,2 ]
Grabarics, Marko [3 ,7 ]
Affiliations
[1] Free Univ Berlin, Dept Biol Chem & Pharm, D-14195 Berlin, Germany
[2] Fritz Haber Inst Max Planck Gesellschaft, Dept Mol Phys, D-14195 Berlin, Germany
[3] Fritz Haber Inst Max Planck Gesellschaft, Dept Mol Phys, D-14195 Berlin, Germany
[4] Univ Copenhagen, Copenhagen Ctr Glyc, Dept Cellular & Mol Med, DK-2200 Copenhagen, Denmark
[5] Univ Utrecht, Bijvoet Ctr Biomol Res, NL-3584 Utrecht, Netherlands
[6] Univ Georgia, Complex Carbohydrate Res Ctr, Athens, GA 30602 USA
[7] Univ Oxford, Dept Chem, Oxford OX1 3TA, England
Funding
EU Horizon 2020;
Keywords
GAS-PHASE; CATIONIZED LYSINE; PROTON AFFINITY; CLASSIFICATION; SIZE;
DOI
10.1021/jacs.2c12762
Chinese Library Classification number
O6 [Chemistry];
Subject classification code
0703 ;
Abstract
In recent years, glycosaminoglycans (GAGs) have emerged into the focus of biochemical and biomedical research due to their importance in a variety of physiological processes. These molecules show great diversity, which makes their analysis highly challenging. A promising tool for identifying the structural motifs and conformation of shorter GAG chains is cryogenic gas-phase infrared (IR) spectroscopy. In this work, the cryogenic gas-phase IR spectra of mass-selected heparan sulfate (HS) di-, tetra-, and hexasaccharide ions were recorded to extract vibrational features that are characteristic of structural motifs. The data were augmented with chondroitin sulfate (CS) disaccharide spectra to assemble a training library for random forest (RF) classifiers. These were used to discriminate between GAG classes (CS or HS) and different sulfate positions (2-O-, 4-O-, 6-O-, and N-sulfation). With optimized data preprocessing and RF modeling, a prediction accuracy of >97% was achieved for HS tetra- and hexasaccharides based on a training set of only 21 spectra. These results exemplify the importance of combining gas-phase cryogenic IR ion spectroscopy with machine learning to improve the future analytical workflow for GAG sequencing and that of other biomolecules, such as metabolites.
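The classification step summarized in the abstract can be illustrated with a toy random forest. The sketch below is not the authors' implementation: the abstract does not give preprocessing steps or hyperparameters, so the synthetic "spectra" (one Gaussian band whose position depends on the class), the bin count, the tree depth, and all function names are assumptions made purely for illustration. It covers only the CS-versus-HS decision, using bootstrap sampling, random feature subsets at each split, and majority voting.

```python
import math
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (1.0 for an empty list)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y, features):
    """Return the (feature, threshold) that lowers weighted Gini most, or None."""
    best, best_score = None, gini(y)
    for f in features:
        values = sorted({row[f] for row in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if score < best_score:
                best, best_score = (f, t), score
    return best

def grow_tree(X, y, rng, n_features, depth=0, max_depth=5):
    """Grow one tree: a leaf is a class label, a node is (f, t, left, right)."""
    majority = Counter(y).most_common(1)[0][0]
    if depth == max_depth or len(set(y)) == 1:
        return majority
    # Random feature subset at every node (the "random" in random forest).
    features = rng.sample(range(len(X[0])), n_features)
    split = best_split(X, y, features)
    if split is None:
        return majority
    f, t = split
    left = [i for i in range(len(X)) if X[i][f] <= t]
    right = [i for i in range(len(X)) if X[i][f] > t]
    return (f, t,
            grow_tree([X[i] for i in left], [y[i] for i in left],
                      rng, n_features, depth + 1, max_depth),
            grow_tree([X[i] for i in right], [y[i] for i in right],
                      rng, n_features, depth + 1, max_depth))

def predict_tree(tree, row):
    """Walk from the root to a leaf and return the leaf's class label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

def fit_forest(X, y, n_trees=25, seed=0):
    """Train n_trees trees, each on a bootstrap resample of the data."""
    rng = random.Random(seed)
    n_features = max(1, int(math.sqrt(len(X[0]))))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        forest.append(grow_tree([X[i] for i in idx], [y[i] for i in idx],
                                rng, n_features))
    return forest

def predict_forest(forest, row):
    """Majority vote over all trees."""
    votes = Counter(predict_tree(tree, row) for tree in forest)
    return votes.most_common(1)[0][0]

def synthetic_spectrum(label, rng, n_bins=30):
    """Hypothetical toy spectrum, NOT real IR data: one noisy Gaussian band."""
    centre = 8 if label == "CS" else 20
    return [math.exp(-((i - centre) ** 2) / 8.0) + rng.gauss(0, 0.05)
            for i in range(n_bins)]

def make_dataset(n_per_class, seed):
    """Build a balanced toy set of labelled synthetic spectra."""
    rng = random.Random(seed)
    labels = ["CS", "HS"] * n_per_class
    return [synthetic_spectrum(lab, rng) for lab in labels], labels

X_train, y_train = make_dataset(10, seed=1)
X_test, y_test = make_dataset(10, seed=2)
forest = fit_forest(X_train, y_train)
predictions = [predict_forest(forest, row) for row in X_test]
accuracy = sum(p == t for p, t in zip(predictions, y_test)) / len(y_test)
print(f"toy accuracy: {accuracy:.2f}")
```

The toy accuracy says nothing about the reported >97% figure, which refers to measured spectra. In practice a library implementation such as scikit-learn's `RandomForestClassifier` would likely replace the hand-written trees, and predicting several sulfate positions at once would need one classifier per position or a multi-label setup.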
Pages: 7859-7868
Page count: 10
Related papers
50 in total
  • [1] A random forest model for predicting exosomal proteins using evolutionary information and motifs
    Arora, Akanksha
    Patiyal, Sumeet
    Sharma, Neelam
    Devi, Naorem Leimarembi
    Kaur, Dashleen
    Raghava, Gajendra P. S.
    [J]. PROTEOMICS, 2024, 24 (06)
  • [2] Unravelling the structural complexity of glycolipids with cryogenic infrared spectroscopy
    Kirschbaum, Carla
    Greis, Kim
    Mucha, Eike
    Kain, Lisa
    Deng, Shenglou
    Zappe, Andreas
    Gewinner, Sandy
    Schoellkopf, Wieland
    von Helden, Gert
    Meijer, Gerard
    Savage, Paul B.
    Marianski, Mateusz
    Teyton, Luc
    Pagel, Kevin
    [J]. NATURE COMMUNICATIONS, 2021, 12 (01)
  • [3] Unravelling the structural complexity of glycolipids with cryogenic infrared spectroscopy
    Carla Kirschbaum
    Kim Greis
    Eike Mucha
    Lisa Kain
    Shenglou Deng
    Andreas Zappe
    Sandy Gewinner
    Wieland Schöllkopf
    Gert von Helden
    Gerard Meijer
    Paul B. Savage
    Mateusz Marianski
    Luc Teyton
    Kevin Pagel
    [J]. Nature Communications, 12
  • [4] Using Random Forest Algorithm to Predict β-Hairpin Motifs
    Jia, Shao-Chun
    Hu, Xiu-Zhen
    [J]. PROTEIN AND PEPTIDE LETTERS, 2011, 18 (06): 609-617
  • [5] Intramolecular Hydrogen Bonding Motifs in Deprotonated Glycine Peptides by Cryogenic Ion Infrared Spectroscopy
    Marsh, Brett M.
    Duffy, Erin M.
    Soukup, Michael T.
    Zhou, Jia
    Garand, Etienne
    [J]. JOURNAL OF PHYSICAL CHEMISTRY A, 2014, 118 (22): 3906-3912
  • [6] Drugs Identification Using Near-Infrared Spectroscopy Based on Random Forest and CatBoost
    Jiang Ping
    Lu Hao-xiang
    Liu Zhen-bing
    [J]. SPECTROSCOPY AND SPECTRAL ANALYSIS, 2022, 42 (07): 2148-2155
  • [7] Using near-infrared spectroscopy and a random forest regressor to estimate intracranial pressure
    Relander, Filip A. J.
    Ruesch, Alexander
    Yang, Jason
    Acharya, Deepshikha
    Scammon, Bradley
    Schmitt, Samantha
    Crane, Emily C.
    Smith, Matthew A.
    Kainerstorfer, Jana M.
    [J]. NEUROPHOTONICS, 2022, 9 (04)
  • [8] Predicting bioavailability change of complex chemical mixtures in contaminated soils using visible and near-infrared spectroscopy and random forest regression
    S. Cipullo
    S. Nawar
    A. M. Mouazen
    P. Campo-Moreno
    F. Coulon
    [J]. Scientific Reports, 9
  • [9] Predicting bioavailability change of complex chemical mixtures in contaminated soils using visible and near-infrared spectroscopy and random forest regression
    Cipullo, S.
    Nawar, S.
    Mouazen, A. M.
    Campo-Moreno, P.
    Coulon, F.
    [J]. SCIENTIFIC REPORTS, 2019, 9 (1)
  • [10] Resolving Sphingolipid Isomers Using Cryogenic Infrared Spectroscopy
    Kirschbaum, Carla
    Saied, Essa M.
    Greis, Kim
    Mucha, Eike
    Gewinner, Sandy
    Schoellkopf, Wieland
    Meijer, Gerard
    von Helden, Gert
    Poad, Berwyck L. J.
    Blanksby, Stephen J.
    Arenz, Christoph
    Pagel, Kevin
    [J]. ANGEWANDTE CHEMIE-INTERNATIONAL EDITION, 2020, 59 (32): 13638-13642