HSQC Spectra Simulation and Matching for Molecular Identification

Cited by: 2
Authors
Priessner, Martin [1]
Lewis, Richard J. [2]
Johansson, Magnus J. [1]
Goodman, Jonathan M. [3]
Janet, Jon Paul [4]
Tomberg, Anna [1]
Affiliations
[1] AstraZeneca, BioPharmaceut R&D, Med Chem Res & Early Dev, Cardiovasc Renal & Metab CVRM, S-43183 Molndal, Sweden
[2] AstraZeneca, Dept Med Chem Res & Early Dev, Resp & Immunol, BioPharmaceut R&D, S-43183 Molndal, Sweden
[3] Univ Cambridge, Ctr Mol Informat, Yusuf Hamied Dept Chem, Cambridge CB2 1EW, England
[4] AstraZeneca, Mol AI, Discovery Sci, R&D, S-43183 Molndal, Sweden
Keywords
STRUCTURAL REVISION; NATURAL-PRODUCTS; CHEMICAL-SHIFTS; VALIDATION; PREDICTION; DP4
DOI
10.1021/acs.jcim.3c01735
Chinese Library Classification number
R914 [Medicinal Chemistry]
Discipline classification code
100701
Abstract
In the pursuit of improved compound identification and database search tasks, this study explores heteronuclear single quantum coherence (HSQC) spectra simulation and matching methodologies. HSQC spectra serve as unique molecular fingerprints, enabling a valuable balance of data collection time and information richness. We conducted a comprehensive evaluation of the following four HSQC simulation techniques: ACD/Labs (ACD), MestReNova (MNova), Gaussian NMR calculations (DFT), and a graph-based neural network (ML). For the latter two techniques, we developed a reconstruction logic to combine proton and carbon 1D spectra into HSQC spectra. The methodology involved the implementation of three peak-matching strategies (minimum-sum, Euclidean-distance, and Hungarian distance) combined with three padding strategies (zero-padding, peak-truncated, and nearest-neighbor double assignment). We found that coupling these strategies with a robust simulation technique facilitates the accurate identification of correct molecules from similar analogues (regio- and stereoisomers) and allows for fast and accurate large database searches. Furthermore, we demonstrated the efficacy of the best-performing methodology by rectifying the structures of a set of previously misidentified molecules. This research indicates that effective HSQC spectral simulation and matching methodologies significantly facilitate molecular structure elucidation. Furthermore, we offer a Google Colab notebook for researchers to use our methods on their own data (https://github.com/AstraZeneca/hsqc_structure_elucidation.git).
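The combination of a padding strategy with a one-to-one peak assignment, as named in the abstract, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the axis weights `h_scale` and `c_scale`, and the normalization by peak count are all assumptions made for illustration. The optimal assignment is found by brute force over permutations, which gives the same minimum-cost matching as the Hungarian algorithm but is only practical for a handful of peaks.

```python
from itertools import permutations
from math import hypot

# Each HSQC peak is a (1H shift in ppm, 13C shift in ppm) pair.


def zero_pad(peaks_a, peaks_b):
    """Pad the shorter peak list with (0, 0) peaks so both lists have equal length."""
    n = max(len(peaks_a), len(peaks_b))
    padded_a = list(peaks_a) + [(0.0, 0.0)] * (n - len(peaks_a))
    padded_b = list(peaks_b) + [(0.0, 0.0)] * (n - len(peaks_b))
    return padded_a, padded_b


def peak_distance(p, q, h_scale=1.0, c_scale=0.1):
    """Euclidean distance between two peaks.

    The axis weights are hypothetical; they only compensate for the
    much wider 13C shift range compared with 1H.
    """
    return hypot((p[0] - q[0]) * h_scale, (p[1] - q[1]) * c_scale)


def assignment_score(peaks_a, peaks_b):
    """Mean distance of the cheapest one-to-one pairing of zero-padded peak lists.

    Brute force over all permutations: same optimum as the Hungarian
    algorithm, but factorial cost, so suitable for small examples only.
    """
    a, b = zero_pad(peaks_a, peaks_b)
    if not a:
        return 0.0
    best_total = min(
        sum(peak_distance(p, b[j]) for p, j in zip(a, perm))
        for perm in permutations(range(len(b)))
    )
    return best_total / len(a)


# Toy data (invented for illustration, not taken from the paper).
experimental = [(7.26, 128.0), (3.70, 55.0)]
candidate_match = [(3.70, 55.0), (7.26, 128.0)]     # same peaks, different order
candidate_mismatch = [(7.26, 128.0), (1.20, 20.0)]  # one peak differs

score_match = assignment_score(experimental, candidate_match)
score_mismatch = assignment_score(experimental, candidate_mismatch)
```

Ranking candidate structures by ascending score would place `candidate_match` ahead of `candidate_mismatch`. For realistic peak counts, a polynomial-time solver such as SciPy's `scipy.optimize.linear_sum_assignment` would be the usual replacement for the brute-force search.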
Pages: 3180-3191
Page count: 12