The METLIN small molecule dataset for machine learning-based retention time prediction

Cited by: 0
Authors
Xavier Domingo-Almenara
Carlos Guijas
Elizabeth Billings
J. Rafael Montenegro-Burke
Winnie Uritboonthai
Aries E. Aisporna
Emily Chen
H. Paul Benton
Gary Siuzdak
Affiliations
[1] The Scripps Research Institute, Scripps Center for Metabolomics
[2] The Scripps Research Institute, California Institute for Biomedical Research (Calibr)
[3] The Scripps Research Institute, Department of Integrative Structural and Computational Biology
[4] EURECAT – Technology Centre of Catalonia & Rovira i Virgili University joint unit, Centre for Omic Sciences
DOI: not available
Abstract
Machine learning has been extensively applied in small molecule analysis to predict a wide range of molecular properties and processes including mass spectrometry fragmentation or chromatographic retention time. However, current approaches for retention time prediction lack sufficient accuracy due to limited available experimental data. Here we introduce the METLIN small molecule retention time (SMRT) dataset, an experimentally acquired reverse-phase chromatography retention time dataset covering up to 80,038 small molecules. To demonstrate the utility of this dataset, we deployed a deep learning model for retention time prediction applied to small molecule annotation. Results showed that in 70% of the cases, the correct molecular identity was ranked among the top 3 candidates based on their predicted retention time. We anticipate that this dataset will enable the community to apply machine learning or first principles strategies to generate better models for retention time prediction.
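The top-3 evaluation mentioned in the abstract can be read as a ranking problem: candidates for an observed feature are ordered by how close their predicted retention time is to the observed one. The sketch below illustrates that reading only. It is not the authors' pipeline, and the function names, candidate labels, and retention time values are all hypothetical.

```python
from typing import Dict, List


def rank_candidates(observed_rt: float, predicted_rts: Dict[str, float]) -> List[str]:
    """Order candidate names by |predicted RT - observed RT|, closest first."""
    return sorted(predicted_rts, key=lambda name: abs(predicted_rts[name] - observed_rt))


def top_k_hit(observed_rt: float, predicted_rts: Dict[str, float],
              true_name: str, k: int = 3) -> bool:
    """Return True if the true identity is among the k closest candidates."""
    return true_name in rank_candidates(observed_rt, predicted_rts)[:k]


# Made-up predicted retention times in seconds (assumption: NOT SMRT data).
candidates = {"cand_A": 412.0, "cand_B": 655.0, "cand_C": 398.0, "cand_D": 520.0}

# Rank the candidates against a hypothetical observed retention time of 400 s.
ranking = rank_candidates(400.0, candidates)
print(ranking)
```

Averaging `top_k_hit` over many annotated features would give a top-3 rate comparable in spirit to the 70% figure, under whatever candidate-selection rule is assumed.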
Related papers
50 in total
  • [1] The METLIN small molecule dataset for machine learning-based retention time prediction
    Domingo-Almenara, Xavier
    Guijas, Carlos
    Billings, Elizabeth
    Montenegro-Burke, J. Rafael
    Uritboonthai, Winnie
    Aisporna, Aries E.
    Chen, Emily
    Benton, H. Paul
    Siuzdak, Gary
    [J]. NATURE COMMUNICATIONS, 2019, 10 (1)
  • [2] Machine Learning-Based Retention Time Prediction of Trimethylsilyl Derivatives of Metabolites
    de Cripan, Sara M.
    Cereto-Massague, Adria
    Herrero, Pol
    Barcaru, Andrei
    Canela, Nuria
    Domingo-Almenara, Xavier
    [J]. BIOMEDICINES, 2022, 10 (04)
  • [3] Comprehensive and Empirical Evaluation of Machine Learning Algorithms for Small Molecule LC Retention Time Prediction
    Bouwmeester, Robbin
    Martens, Lennart
    Degroeve, Sven
    [J]. ANALYTICAL CHEMISTRY, 2019, 91 (05) : 3694 - 3703
  • [4] MACHINE LEARNING-BASED PERFORMANCE PREDICTION MODEL OPTIMIZATION FOR SOI LDMOS USING ADAPTIVE SMALL SPACE DATASET
    You, Jinwen
    Chen, Jing
    Yao, Qing
    Dai, Yuxuan
    Guo, Yufeng
    [J]. CONFERENCE OF SCIENCE & TECHNOLOGY FOR INTEGRATED CIRCUITS, 2024 CSTIC, 2024
  • [5] Machine learning-based risk prediction model for cardiovascular disease using a hybrid dataset
    Kanagarathinam, Karthick
    Sankaran, Durairaj
    Manikandan, R.
    [J]. DATA & KNOWLEDGE ENGINEERING, 2022, 140
  • [6] Machine learning-based prediction of transfusion
    Mitterecker, Andreas
    Hofmann, Axel
    Trentino, Kevin M.
    Lloyd, Adam
    Leahy, Michael F.
    Schwarzbauer, Karin
    Tschoellitsch, Thomas
    Boeck, Carl
    Hochreiter, Sepp
    Meier, Jens
    [J]. TRANSFUSION, 2020, 60 (09) : 1977 - 1986
  • [7] MACHINE LEARNING-BASED EARLY MORTALITY PREDICTION AT THE TIME OF ICU ADMISSION
    McManus, Sean
    Almuqati, Reem
    Khatib, Reem
    Khanna, Ashish
    Cywinski, Jacek
    Papay, Francis
    Mathur, Piyush
    [J]. CRITICAL CARE MEDICINE, 2022, 50 (01) : 607 - 607
  • [8] Machine Learning-Based Time Series Prediction at Brazilian Stocks Exchange
    dos Santos Gularte, Ana Paula
    Filho, Danusio Gadelha Guimaraes
    de Oliveira Torres, Gabriel
    da Silva, Thiago Carvalho Nunes
    Curtis, Vitor Venceslau
    [J]. COMPUTATIONAL ECONOMICS, 2023
  • [9] Machine learning-based radiotherapy time prediction and treatment scheduling management
    Xie, Lisiqi
    Xu, Dan
    He, Kangjian
    Tian, Xin
    [J]. JOURNAL OF APPLIED CLINICAL MEDICAL PHYSICS, 2023, 24 (09)
  • [10] Bayesian machine learning-based method for prediction of slope failure time
    Zhang, Jie
    Wang, Zipeng
    Hu, Jinzheng
    Xiao, Shihao
    Shang, Wenyu
    [J]. JOURNAL OF ROCK MECHANICS AND GEOTECHNICAL ENGINEERING, 2022, 14 (04) : 1188 - 1199