xMSanalyzer: automated pipeline for improved feature detection and downstream analysis of large-scale, non-targeted metabolomics data

Cited by: 274
Authors
Uppal, Karan [1, 6]
Soltow, Quinlyn A. [2]
Strobel, Frederick H. [3]
Pittard, W. Stephen [1]
Gernert, Kim M. [1]
Yu, Tianwei [4]
Jones, Dean P. [2, 5]
Affiliations
[1] Emory Univ, Sch Med, BimCore, Atlanta, GA USA
[2] Emory Univ, Dept Med, Div Pulm Allergy & Crit Care, Atlanta, GA 30322 USA
[3] Emory Univ, Mass Spectrometry Ctr, Atlanta, GA 30322 USA
[4] Emory Univ, Rollins Sch Publ Hlth, Dept Biostat & Bioinformat, Atlanta, GA 30322 USA
[5] Emory Univ, Clin Biomarkers Lab, Atlanta, GA 30322 USA
[6] Georgia Inst Technol, Sch Biol, Atlanta, GA 30332 USA
Source
BMC BIOINFORMATICS | 2013 / Vol. 14
Funding
National Institutes of Health (USA);
Keywords
OPEN-SOURCE SOFTWARE; MASS; ALIGNMENT; ALGORITHMS; FRAMEWORK; OPENMS; MZMINE; SUITE;
DOI
10.1186/1471-2105-14-15
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Detection of low abundance metabolites is important for de novo mapping of metabolic pathways related to diet, microbiome or environmental exposures. Multiple algorithms are available to extract m/z features from liquid chromatography-mass spectral data in a conservative manner, which tends to preclude detection of low abundance chemicals and chemicals found in small subsets of samples. The present study provides software to enhance such algorithms for feature detection, quality assessment, and annotation. Results: xMSanalyzer is a set of utilities for automated processing of metabolomics data. The utilities can be classified into four main modules to: 1) improve feature detection for replicate analyses by systematic re-extraction with multiple parameter settings and data merger to optimize the balance between sensitivity and reliability, 2) evaluate sample quality and feature consistency, 3) detect feature overlap between datasets, and 4) characterize high-resolution m/z matches to small molecule metabolites and biological pathways using multiple chemical databases. The package was tested with plasma samples and shown to more than double the number of features extracted while improving quantitative reliability of detection. MS/MS analysis of a random subset of peaks that were exclusively detected using xMSanalyzer confirmed that the optimization scheme improves detection of real metabolites. Conclusions: xMSanalyzer is a package of utilities for data extraction, quality control assessment, detection of overlapping and unique metabolites in multiple datasets, and batch annotation of metabolites. The program was designed to integrate with existing packages such as apLCMS and XCMS, but the framework can also be used to enhance data extraction for other LC/MS data software.
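The first module described in the abstract (re-extraction with multiple parameter settings followed by data merger) can be illustrated with a minimal Python sketch. This is not the package's code or API: the `Feature` class, the function names, the 10 ppm and 30 s tolerances, and the rule of keeping the copy with the lower replicate coefficient of variation are all assumptions made for illustration.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass
class Feature:                 # hypothetical container, not an xMSanalyzer type
    mz: float                  # mass-to-charge ratio
    rt: float                  # retention time in seconds
    intensities: list          # intensities across technical replicates


def cv_percent(values):
    """Coefficient of variation (%) across replicate intensities."""
    m = mean(values)
    return float("inf") if m == 0 else 100.0 * stdev(values) / m


def same_feature(a, b, ppm_tol=10.0, rt_tol=30.0):
    """True if two features agree within assumed m/z (ppm) and RT (s) tolerances."""
    ppm_diff = abs(a.mz - b.mz) / a.mz * 1e6
    return ppm_diff <= ppm_tol and abs(a.rt - b.rt) <= rt_tol


def merge_feature_lists(lists, ppm_tol=10.0, rt_tol=30.0):
    """Union of features from several extraction runs.

    When two runs report the same feature, keep the copy whose replicate
    intensities have the lower CV (an assumed proxy for reliability).
    """
    merged = []
    for features in lists:
        for f in features:
            for i, kept in enumerate(merged):
                if same_feature(kept, f, ppm_tol, rt_tol):
                    if cv_percent(f.intensities) < cv_percent(kept.intensities):
                        merged[i] = f
                    break
            else:
                # No match among features kept so far: a new feature.
                merged.append(f)
    return merged


# Made-up example values: one conservative run and one more permissive run.
strict = [Feature(180.0634, 95.0, [1000.0, 1100.0, 900.0])]
lenient = [
    Feature(180.0635, 96.0, [1000.0, 1010.0, 990.0]),  # same peak, tighter replicates
    Feature(255.2320, 410.0, [50.0, 55.0, 45.0]),      # low-abundance peak found only here
]
merged = merge_feature_lists([strict, lenient])
print(len(merged))
```

In this toy input the permissive run contributes one additional low-abundance feature and replaces the shared feature with its more reproducible copy, which is the sensitivity-versus-reliability trade-off the abstract refers to.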
Pages: 12