Simulated LC-MS Data Set for Assessing the Metabolomics Data Processing Pipeline Implemented into MVAPACK

Cited by: 0
Authors
Jurich, Christopher P. [1]
Jeppesen, Micah J. [1, 2]
Sakallioglu, Isin T. [1]
Leite, Aline De Lima [1, 2]
Yesselman, Joseph D. [1, 2]
Powers, Robert [1, 2]
Affiliations
[1] Univ Nebraska Lincoln, Dept Chem, Lincoln, NE 68588 USA
[2] Univ Nebraska Lincoln, Nebraska Ctr Integrated Biomol Commun, Lincoln, NE 68588 USA
Funding
National Science Foundation (USA); National Institutes of Health (USA)
Keywords
D-CYCLOSERINE; NMR; STRATEGIES; TOOL
DOI
10.1021/acs.analchem.3c04979
Chinese Library Classification
O65 [Analytical Chemistry]
Discipline classification codes
070302; 081704
Abstract
Metabolomics commonly relies on one-dimensional (1D) ¹H NMR spectroscopy or liquid chromatography-mass spectrometry (LC-MS) to derive scientific insights from large collections of biological samples. NMR and MS approaches to metabolomics require, among other things, a data processing pipeline. Quantitative assessment of the performance of these software platforms is challenged by a lack of standardized data sets with "known" outcomes. To resolve this issue, we created a novel simulated LC-MS data set with known peak locations and intensities, defined metabolite differences between groups (i.e., fold change > 2, coefficient of variation <= 25%), and different amounts of added Gaussian noise (0, 5, or 10%) and missing features (0, 10, or 20%). This data set was developed to improve benchmarking of existing LC-MS metabolomics software and to validate the updated version of our MVAPACK software, which added gas chromatography-MS and LC-MS functionality to its existing 1D and two-dimensional NMR data processing capabilities. We also included two experimental LC-MS data sets acquired from a standard mixture and Mycobacterium smegmatis cell lysates, since a simulated data set alone may not capture all the unique characteristics and variability of real spectra needed to assess software performance properly. Our simulated and experimental LC-MS data sets were processed with the MS-DIAL and XCMS Online software packages and our MVAPACK toolkit to showcase the utility of our data sets to benchmark MVAPACK against community standards. Our results demonstrate the enhanced objectivity and clarity of software assessment that can be achieved when both simulated and experimental data are employed, since distinctly different software performances were observed with the simulated and experimental LC-MS data sets. We also demonstrate that the performance of MVAPACK is equivalent to or exceeds that of existing LC-MS software programs while providing a single platform for processing and analyzing both NMR and MS data sets.
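The abstract names the design parameters of the simulated data set (fold change, coefficient of variation, added Gaussian noise, and missing features) but not how it was generated. The following is a minimal sketch of how a feature table with those properties might be built, and it is not the authors' procedure. The function names, the default values, the uniform range of base intensities, the reading of the noise level as a percentage of the mean intensity, and the treatment of "missing features" as individually missing values are all assumptions made for illustration.

```python
import random
import statistics


def simulate_feature_table(n_features=20, n_per_group=6, n_changed=5,
                           fold_change=2.5, cv=0.10,
                           noise_pct=5.0, missing_pct=10.0, seed=0):
    """Return a list of rows, one per feature, with known ground truth.

    All parameter names and defaults are hypothetical; only the kinds of
    parameters (fold change, CV, noise %, missing %) come from the abstract.
    """
    rng = random.Random(seed)
    table = []
    for index in range(n_features):
        base = rng.uniform(1e4, 1e6)      # assumed "known" intensity
        changed = index < n_changed       # first n_changed features differ
        row = {"changed": changed, "control": [], "treated": []}
        for group in ("control", "treated"):
            mean = base * fold_change if (changed and group == "treated") else base
            for _ in range(n_per_group):
                # Within-group variation set by the coefficient of variation.
                value = rng.gauss(mean, cv * mean)
                # Added Gaussian noise, assumed relative to the mean intensity.
                value += rng.gauss(0.0, noise_pct / 100.0 * mean)
                value = max(value, 0.0)   # intensities cannot be negative
                # Randomly drop values to mimic missing features.
                if rng.random() < missing_pct / 100.0:
                    value = None
                row[group].append(value)
        table.append(row)
    return table


def observed_fold_change(row):
    """Ratio of group means, ignoring missing values."""
    treated = [v for v in row["treated"] if v is not None]
    control = [v for v in row["control"] if v is not None]
    return statistics.mean(treated) / statistics.mean(control)
```

Because the ground truth (`changed`, `fold_change`) is stored alongside the simulated intensities, the output of any processing pipeline can be scored against it. Setting `cv`, `noise_pct`, and `missing_pct` to zero recovers the defined fold change exactly, which is a convenient sanity check before the noise and missing-value levels are raised.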
Pages: 12943-12956
Page count: 14