Preprocessing, classification modeling and feature selection using flow injection electrospray mass spectrometry metabolite fingerprint data

Cited by: 79
Authors
Enot, David P. [1 ]
Lin, Wanchang [1 ]
Beckmann, Manfred [1 ]
Parker, David [1 ]
Overy, David P. [1 ]
Draper, John [1 ]
Affiliations
[1] Aberystwyth Univ, Inst Biol Sci, Aberystwyth SY23 3DA, Dyfed, Wales
Funding
Biotechnology and Biological Sciences Research Council (BBSRC), UK;
Keywords
DOI
10.1038/nprot.2007.511
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Metabolome analysis by flow injection electrospray mass spectrometry (FIE-MS) fingerprinting generates measurements relating to large numbers of m/z signals. Such data sets often exhibit high variance with a paucity of replicates, thus providing a challenge for data mining. We describe data preprocessing and modeling methods that have proved reliable in projects involving samples from a range of organisms. The protocols interact with software resources specifically for metabolomics provided in a Web-accessible data analysis package FIEmspro (http://users.aber.ac.uk/jhd) written in the R environment and requiring a moderate knowledge of R command-line usage. Specific emphasis is placed on describing the outcome of modeling experiments using FIE-MS data that require further preprocessing to improve quality. The salient features of both poor and robust (i.e., highly generalizable) multivariate models are outlined together with advice on validating classifiers and avoiding false discovery when seeking explanatory variables.
Pages: 446 - 470
Number of pages: 25
Related papers
50 in total
  • [1] Preprocessing, classification modeling and feature selection using flow injection electrospray mass spectrometry metabolite fingerprint data
    David P Enot
    Wanchang Lin
    Manfred Beckmann
    David Parker
    David P Overy
    John Draper
    Nature Protocols, 2008, 3 : 446 - 470
  • [2] High-throughput, nontargeted metabolite fingerprinting using nominal mass flow injection electrospray mass spectrometry
    Beckmann, Manfred
    Parker, David
    Enot, David P.
    Duval, Emilie
    Draper, John
    NATURE PROTOCOLS, 2008, 3 (03) : 486 - 504
  • [3] High-throughput, nontargeted metabolite fingerprinting using nominal mass flow injection electrospray mass spectrometry
    Manfred Beckmann
    David Parker
    David P Enot
    Emilie Duval
    John Draper
    Nature Protocols, 2008, 3 : 486 - 504
  • [4] Feature selection as a preprocessing step for classification in gene expression data
    Borges, Helyane Bronoski
    Nievola, Julio Cesar
    PROCEEDINGS OF THE 7TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS DESIGN AND APPLICATIONS, 2007, : 157 - +
  • [5] A comparative investigation of modern feature selection and classification approaches for the analysis of mass spectrometry data
    Gromski, Piotr S.
    Xu, Yun
    Correa, Elon
    Ellis, David I.
    Turner, Michael L.
    Goodacre, Royston
    ANALYTICA CHIMICA ACTA, 2014, 829 : 1 - 8
  • [6] Recursive SVM feature selection and sample classification for mass-spectrometry and microarray data
    Zhang, XG
    Lu, X
    Shi, Q
    Xu, XQ
    Leung, HCE
    Harris, LN
    D Iglehart, J
    Miron, A
    Liu, JS
    Wong, WH
    BMC BIOINFORMATICS, 2006, 7 (1)
  • [7] Recursive SVM feature selection and sample classification for mass-spectrometry and microarray data
    Xuegong Zhang
    Xin Lu
    Qian Shi
    Xiu-qin Xu
    Hon-chiu E Leung
    Lyndsay N Harris
    James D Iglehart
    Alexander Miron
    Jun S Liu
    Wing H Wong
    BMC Bioinformatics, 7
  • [8] proFIA: a data preprocessing workflow for flow injection analysis coupled to high-resolution mass spectrometry
    Delabriere, Alexis
    Hohenester, Ulli M.
    Colsch, Benoit
    Junot, Christophe
    Fenaille, Francoise
    Thevenot, Etienne A.
    BIOINFORMATICS, 2017, 33 (23) : 3767 - 3775
  • [9] Analysis of saccharides in beer samples by flow injection with electrospray mass spectrometry
    Mauri, P
    Minoggio, M
    Simonetti, P
    Gardana, C
    Pietta, P
    RAPID COMMUNICATIONS IN MASS SPECTROMETRY, 2002, 16 (08) : 743 - 748
  • [10] Feature selection and nearest centroid classification for protein mass spectrometry
    Levner, I
    BMC BIOINFORMATICS, 2005, 6 (1)