Selective of informative metabolites using random forests based on model population analysis

Cited by: 24
Authors
Huang, Jian-Hua [1]
Yan, Jun [1]
Wu, Qing-Hua [1]
Ferro, Miguel Duarte [1]
Yi, Lun-Zhao [1]
Lu, Hong-Mei [1]
Xu, Qing-Song [2]
Liang, Yi-Zeng [1]
Affiliations
[1] Cent South Univ, Res Ctr Modernizat Tradit Chinese Med, Changsha 410083, Hunan, Peoples R China
[2] Cent South Univ, Sch Math Sci & Comp Technol, Changsha 410083, Hunan, Peoples R China
Keywords
Random forests (RF); Model population analysis (MPA); Informative metabolite; Feature selection; GAS CHROMATOGRAPHY/MASS SPECTROMETRY; OF-BAG ESTIMATION; FATTY-ACID; PLASMA; RATS; METABOLOMICS; STRAINS; OBESITY; GC/MS;
DOI
10.1016/j.talanta.2013.07.070
Chinese Library Classification number
O65 [Analytical Chemistry]
Discipline classification code
070302; 081704
Abstract
One of the main goals of metabolomics studies is to discover informative metabolites or biomarkers, which may be used to diagnose diseases and to elucidate pathology. Sophisticated feature selection approaches are required to extract the information hidden in such complex 'omics' data. This study proposes a new and robust selection method that combines random forests (RF) with model population analysis (MPA) and applies it to select informative metabolites from three metabolomic datasets. According to their contribution to classification accuracy, the metabolites were classified into three kinds: informative, uninformative, and interfering. With the proposed method, informative metabolites were selected for each of the three datasets; further comparison of these metabolites between healthy and diseased groups by t-test showed that the P values of all selected metabolites were lower than 0.05. Moreover, the informative metabolites identified by the method were shown to be correlated with the clinical outcome under investigation. The source code of MPA-RF in Matlab can be freely downloaded from http://code.google.com/p/my-research-list/downloads/list. (C) 2013 Elsevier B.V. All rights reserved.
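The three-way categorization described in the abstract can be illustrated with a minimal sketch. This is not the authors' Matlab implementation. It assumes a population of sub-models has already been built on different metabolite subsets and that each sub-model has a recorded classification accuracy. In the paper's setting those accuracies would come from random forests; here they are synthetic values generated by a fixed rule. The function names `welch_t` and `categorize`, the use of Welch's t statistic, and the cutoff `t_crit=2.0` are all assumptions made for illustration.

```python
from itertools import combinations
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two samples (each must hold at least 2 values)."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def categorize(sub_models, n_vars, t_crit=2.0):
    """Label each variable by how its inclusion shifts sub-model accuracy.

    sub_models: list of (set of variable indices, accuracy) pairs.
    t_crit: hypothetical cutoff on |t|; an assumption, not a value from the paper.
    """
    labels = {}
    for j in range(n_vars):
        with_j = [acc for subset, acc in sub_models if j in subset]
        without_j = [acc for subset, acc in sub_models if j not in subset]
        t = welch_t(with_j, without_j)
        if t > t_crit:
            labels[j] = "informative"      # including j raises accuracy
        elif t < -t_crit:
            labels[j] = "interfering"      # including j lowers accuracy
        else:
            labels[j] = "uninformative"    # no clear effect either way
    return labels


# Synthetic population: every 3-variable subset of 6 variables.
# By construction, variable 0 helps and variable 2 hurts; the rest are neutral.
sub_models = []
for combo in combinations(range(6), 3):
    subset = set(combo)
    acc = 0.70 + 0.10 * (0 in subset) - 0.10 * (2 in subset)
    sub_models.append((subset, acc))

labels = categorize(sub_models, n_vars=6)
print(labels)
```

In a real analysis the subsets would be drawn by random sampling and the accuracies estimated from fitted classifiers, so the comparison between the two accuracy distributions would be noisy and the choice of test and threshold would matter more than it does in this toy case.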
Pages: 549-555
Page count: 7