Pretreating and normalizing metabolomics data for statistical analysis

Cited by: 24
Authors
Sun, Jun [1]
Xia, Yinglin [2]
Affiliations
[1] Univ Illinois, UIC Canc Ctr, Jesse Brown VA Med Ctr Chicago 537, Dept Med, Chicago, IL 60612 USA
[2] Univ Illinois, Dept Med, Div Gastroenterol & Hepatol, Chicago, IL 60612 USA
Keywords
Data centering and scaling; Data normalization; Data transformation; Missing values; MS-Based data preprocessing; NMR Data preprocessing; Outliers; Preprocessing/pretreatment; CHROMATOGRAPHY-MASS SPECTROMETRY; TIME-DOMAIN ALGORITHM; PATTERN-RECOGNITION; H-1-NMR SPECTRA; MISSING VALUES; URINE; CREATININE; IMPUTATION; REDUCTION; METABOANALYST
DOI
10.1016/j.gendis.2023.04.018
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Metabolomics as a research field and a set of techniques is to study the entire small molecules in biological samples. Metabolomics is emerging as a powerful tool generally for precision medicine. Particularly, integration of microbiome and metabolome has revealed the mechanism and functionality of microbiome in human health and disease. However, metabolomics data are very complicated. Preprocessing/pretreating and normalizing procedures on metabolomics data are usually required before statistical analysis. In this review article, we comprehensively review various methods that are used to preprocess and pretreat metabolomics data, including MS-based data and NMR-based data preprocessing, dealing with zero and/or missing values and detecting outliers, data normalization, data centering and scaling, data transformation. We discuss the advantages and limitations of each method. The choice for a suitable preprocessing method is determined by the biological hypothesis, the characteristics of the data set, and the selected statistical data analysis method. We then provide the perspective of their applications in the microbiome and metabolome research. (c) 2023 The Authors. Publishing services by Elsevier B.V. on behalf of KeAi Communications Co., Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Pages: 18
Related papers
50 in total
  • [31] Statistical and computational methods for integrating microbiome, host genomics, and metabolomics data
    Deek, Rebecca A.
    Ma, Siyuan
    Lewis, James
    Li, Hongzhe
    ELIFE, 2024, 13
  • [32] Effects of offset normalizing techniques on variability in motion analysis data
    Mullineaux, D.R., 1600, Human Kinetics Publishers Inc. (20):
  • [33] STATISTICAL ANALYSIS OF DATA
    GREENWOO.J
    BIRD STUDY, 1967, 14 (01) : 64 - &
  • [34] Statistical analysis of feature-based molecular networking results from non-targeted metabolomics data
    Shah, Abzer K. Pakkir
    Walter, Axel
    Ottosson, Filip
    Russo, Francesco
    Navarro-Diaz, Marcelo
    Boldt, Judith
    Kalinski, Jarmo-Charles J.
    Kontou, Eftychia Eva
    Elofson, James
    Polyzois, Alexandros
    Gonzalez-Marin, Carolina
    Farrell, Shane
    Aggerbeck, Marie R.
    Pruksatrakul, Thapanee
    Chan, Nathan
    Wang, Yunshu
    Poechhacker, Magdalena
    Brungs, Corinna
    Camara, Beatriz
    Caraballo-Rodriguez, Andres Mauricio
    Cumsille, Andres
    de Oliveira, Fernanda
    Duehrkop, Kai
    El Abiead, Yasin
    Geibel, Christian
    Graves, Lana G.
    Hansen, Martin
    Heuckeroth, Steffen
    Knoblauch, Simon
    Kostenko, Anastasiia
    Kuijpers, Mirte C. M.
    Mildau, Kevin
    Papadopoulos Lambidis, Stilianos
    Gomes, Paulo Wender Portal
    Schramm, Tilman
    Steuer-Lodd, Karoline
    Stincone, Paolo
    Tayyab, Sibgha
    Vitale, Giovanni Andrea
    Wagner, Berenike C.
    Xing, Shipei
    Yazzie, Marquis T.
    Zuffa, Simone
    de Kruijff, Martinus
    Beemelmanns, Christine
    Link, Hannes
    Mayer, Christoph
    van der Hooft, Justin J. J.
    Damiani, Tito
    Pluskal, Tomas
    NATURE PROTOCOLS, 2025, 20 (01) : 92 - 162
  • [35] muma, An R Package for Metabolomics Univariate and Multivariate Statistical Analysis
    Gaude, Edoardo
    Chignola, Francesca
    Spiliotopoulos, Dimitrios
    Spitaleri, Andrea
    Ghitti, Michela
    Garcia-Manteiga, Jose M.
    Mari, Silvia
    Musco, Giovanna
    CURRENT METABOLOMICS, 2013, 1 (02) : 180 - 189
  • [36] Reflections on univariate and multivariate analysis of metabolomics data
    Edoardo Saccenti
    Huub C. J. Hoefsloot
    Age K. Smilde
    Johan A. Westerhuis
    Margriet M. W. B. Hendriks
    Metabolomics, 2014, 10 : 361 - 374
  • [37] WebSpecmine: A Website for Metabolomics Data Analysis and Mining
    Cardoso, Sara
    Afonso, Telma
    Maraschin, Marcelo
    Rocha, Miguel
    METABOLITES, 2019, 9 (10)
  • [38] Cognitive analysis of metabolomics data for systems biology
    Erica L.-W. Majumder
    Elizabeth M. Billings
    H. Paul Benton
    Richard L. Martin
    Amelia Palermo
    Carlos Guijas
    Markus M. Rinschen
    Xavier Domingo-Almenara
    J. Rafael Montenegro-Burke
    Bradley A. Tagtow
    Robert S. Plumb
    Gary Siuzdak
    Nature Protocols, 2021, 16 : 1376 - 1418
  • [39] MetaFIND: A feature analysis tool for metabolomics data
    Kenneth Bryan
    Lorraine Brennan
    Pádraig Cunningham
    BMC Bioinformatics, 9
  • [40] Cognitive analysis of metabolomics data for systems biology
    Majumder, Erica L-W
    Billings, Elizabeth M.
    Benton, H. Paul
    Martin, Richard L.
    Palermo, Amelia
    Guijas, Carlos
    Rinschen, Markus M.
    Domingo-Almenara, Xavier
    Montenegro-Burke, J. Rafael
    Tagtow, Bradley A.
    Plumb, Robert S.
    Siuzdak, Gary
    NATURE PROTOCOLS, 2021, 16 (03) : 1376 - 1418