Gaussian binning: a new kernel-based method for processing NMR spectroscopic data for metabolomics

Cited by: 51
Authors
Anderson, Paul E. [1]
Reo, Nicholas V. [2]
DelRaso, Nicholas J. [3]
Doom, Travis E. [1]
Raymer, Michael L. [1]
Affiliations
[1] Wright State Univ, Dept Comp Sci & Engn, Dayton, OH 45435 USA
[2] Wright State Univ, Boonshoft Sch Med, Dept Biochem & Mol Biol, Dayton, OH 45429 USA
[3] USAF, Wright Patterson AFB, Human Performance Wing 711, Wright Patterson AFB, OH 45433 USA
Keywords
Gaussian; binning; pattern recognition; quantification; nuclear magnetic resonance
DOI
10.1007/s11306-008-0117-3
Chinese Library Classification
R5 [Internal Medicine]
Discipline classification code
1002; 100201
Abstract
In many metabolomics studies, NMR spectra are divided into bins of fixed width. This spectral quantification technique, known as uniform binning, is used to reduce the number of variables for pattern recognition techniques and to mitigate the effects of variations in peak positions; however, because the bin boundaries do not overlap, shifts in peaks near the boundaries can cause dramatic quantitative changes in adjacent bins. Here we describe a new Gaussian binning method that incorporates overlapping bins to minimize these effects. A Gaussian kernel weights the signal contribution relative to its distance from the bin center, and the overlap between bins is controlled by the kernel standard deviation. Sensitivity to peak shift was assessed for a series of test spectra in which the offset frequency was incremented in 0.5 Hz steps. For a 4 Hz shift within a bin width of 24 Hz, the error for uniform binning increased by 150%, while the error for Gaussian binning increased by 50%. Further, using a urinary metabolomics data set (from a toxicity study) and principal component analysis (PCA), we showed that the information content in the quantified features was equivalent for the Gaussian and uniform binning methods. The separation between groups in the PCA scores plot, measured by the J2 quality metric, is as good as or better for Gaussian binning than for uniform binning. The Gaussian method is shown to be robust with regard to peak shift while still retaining the information needed by classification and multivariate statistical techniques for NMR-metabolomics data.
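The contrast between the two binning schemes described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the synthetic Lorentzian peak, the kernel standard deviation of half the bin width, the unnormalized Gaussian weights, and the relative-change measure are all assumptions made for illustration. Only the 24 Hz bin width, the 4 Hz shift, and the 0.5 Hz frequency step are taken from the abstract.

```python
import numpy as np

def uniform_binning(freq_hz, intensity, edges_hz):
    """Sum intensities inside non-overlapping bins defined by edges_hz."""
    n_bins = len(edges_hz) - 1
    idx = np.clip(np.digitize(freq_hz, edges_hz) - 1, 0, n_bins - 1)
    return np.bincount(idx, weights=intensity, minlength=n_bins)

def gaussian_binning(freq_hz, intensity, edges_hz, sigma_hz):
    """Weight every point by a Gaussian kernel centered on each bin.

    Assumption: weights are left unnormalized; sigma_hz sets how much
    neighboring bins overlap.
    """
    centers = 0.5 * (edges_hz[:-1] + edges_hz[1:])
    dist = freq_hz[None, :] - centers[:, None]
    weights = np.exp(-0.5 * (dist / sigma_hz) ** 2)
    return weights @ intensity

def lorentzian(freq_hz, center_hz, half_width_hz=1.0):
    """Synthetic single-peak spectrum (hypothetical test signal)."""
    return half_width_hz**2 / ((freq_hz - center_hz) ** 2 + half_width_hz**2)

def relative_change(before, after):
    """Size of the change in the binned vector relative to its original size."""
    return np.linalg.norm(after - before) / np.linalg.norm(before)

bin_width_hz = 24.0
freq = np.arange(0.0, 96.0, 0.5)               # 0.5 Hz grid
edges = np.arange(0.0, 96.0 + 1e-9, bin_width_hz)  # 0, 24, 48, 72, 96
sigma = bin_width_hz / 2.0                     # assumed kernel width

# A peak 2 Hz below the 24 Hz boundary, then shifted 4 Hz across it.
spectrum_before = lorentzian(freq, 22.0)
spectrum_after = lorentzian(freq, 26.0)

uniform_before = uniform_binning(freq, spectrum_before, edges)
uniform_after = uniform_binning(freq, spectrum_after, edges)
gaussian_before = gaussian_binning(freq, spectrum_before, edges, sigma)
gaussian_after = gaussian_binning(freq, spectrum_after, edges, sigma)

uniform_change = relative_change(uniform_before, uniform_after)
gaussian_change = relative_change(gaussian_before, gaussian_after)

print("relative change, uniform binning: ", uniform_change)
print("relative change, Gaussian binning:", gaussian_change)
```

In this toy setting the peak crosses a hard boundary under uniform binning, so most of its area moves from one bin to the next, whereas the Gaussian weights vary smoothly with position and the binned values change less. The numbers it prints are not comparable to the error figures reported in the abstract, which come from the authors' own test spectra and error measure.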
Pages: 261-272
Number of pages: 12