Selective of informative metabolites using random forests based on model population analysis

Cited by: 24
Authors
Huang, Jian-Hua [1]
Yan, Jun [1]
Wu, Qing-Hua [1]
Ferro, Miguel Duarte [1]
Yi, Lun-Zhao [1]
Lu, Hong-Mei [1]
Xu, Qing-Song [2]
Liang, Yi-Zeng [1]
Affiliations
[1] Cent South Univ, Res Ctr Modernizat Tradit Chinese Med, Changsha 410083, Hunan, Peoples R China
[2] Cent South Univ, Sch Math Sci & Comp Technol, Changsha 410083, Hunan, Peoples R China
Keywords
Random forests (RF); Model population analysis (MPA); Informative metabolite; Feature selection; GAS CHROMATOGRAPHY/MASS SPECTROMETRY; OF-BAG ESTIMATION; FATTY-ACID; PLASMA; RATS; METABOLOMICS; STRAINS; OBESITY; GC/MS;
DOI
10.1016/j.talanta.2013.07.070
Chinese Library Classification (CLC) number
O65 [Analytical Chemistry];
Discipline classification codes
070302 ; 081704 ;
Abstract
One of the main goals of metabolomics studies is to discover informative metabolites or biomarkers, which may be used to diagnose diseases and to elucidate pathology. Sophisticated feature selection approaches are required to extract the information hidden in such complex 'omics' data. In this study, a new and robust selection method is proposed that combines random forests (RF) with model population analysis (MPA) to select informative metabolites from three metabolomic datasets. According to their contribution to the classification accuracy, the metabolites were classified into three kinds: informative, non-informative, and interfering metabolites. Using the proposed method, informative metabolites were selected for the three datasets; comparing these metabolites between healthy and diseased groups by t-test gave P values below 0.05 for all of them. Moreover, the informative metabolites identified by the method were shown to be correlated with the clinical outcome under investigation. The MPA-RF source code in Matlab can be freely downloaded from http://code.google.com/p/my-research-list/downloads/list. (C) 2013 Elsevier B.V. All rights reserved.
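The abstract does not spell out the algorithm, so the following is only a rough sketch of the general model-population idea it describes: build many sub-models on random subsets of variables, then compare the accuracy of sub-models that include a given variable with those that exclude it. Everything here is an assumption made for illustration: the synthetic data, the function names, the 0.1 tolerance, and in particular the nearest-centroid classifier, which stands in for a random forest so that the example needs only the Python standard library. It is not the authors' MPA-RF code, which is in Matlab.

```python
import random
import statistics


def make_data(rng, n_per_class=40):
    """Synthetic data (assumption): feature 0 separates the classes, 1-3 are noise."""
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(3.0 * label, 1.0)]
            row += [rng.gauss(0.0, 1.0) for _ in range(3)]
            X.append(row)
            y.append(label)
    return X, y


def centroid_accuracy(X_tr, y_tr, X_te, y_te, feats):
    """Nearest-centroid classifier on `feats`; a stand-in for a random forest."""
    centroids = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X_tr, y_tr) if t == label]
        centroids[label] = [statistics.fmean([r[f] for r in rows]) for f in feats]
    hits = 0
    for x, t in zip(X_te, y_te):
        dist = {
            lab: sum((x[f] - c) ** 2 for f, c in zip(feats, cen))
            for lab, cen in centroids.items()
        }
        hits += min(dist, key=dist.get) == t
    return hits / len(y_te)


def mpa_scores(X, y, n_models=300, seed=0):
    """Mean accuracy with each feature included minus mean accuracy without it."""
    rng = random.Random(seed)
    p = len(X[0])
    included = [[] for _ in range(p)]
    excluded = [[] for _ in range(p)]
    idx = list(range(len(X)))
    for _ in range(n_models):
        feats = [f for f in range(p) if rng.random() < 0.5]
        if not feats:
            continue  # an empty sub-model cannot be evaluated
        rng.shuffle(idx)
        cut = int(0.6 * len(idx))  # 60% train, 40% test
        tr, te = idx[:cut], idx[cut:]
        acc = centroid_accuracy(
            [X[i] for i in tr], [y[i] for i in tr],
            [X[i] for i in te], [y[i] for i in te],
            feats,
        )
        for f in range(p):
            (included if f in feats else excluded)[f].append(acc)
    return [statistics.fmean(included[f]) - statistics.fmean(excluded[f])
            for f in range(p)]


def label_feature(diff, tol=0.1):
    """Three-way labelling; the fixed tolerance replaces a proper significance test."""
    if diff > tol:
        return "informative"
    if diff < -tol:
        return "interfering"
    return "non-informative"


X, y = make_data(random.Random(42))
scores = mpa_scores(X, y)
labels = [label_feature(s) for s in scores]
for f, (s, lab) in enumerate(zip(scores, labels)):
    print(f"feature {f}: accuracy difference {s:+.3f} -> {lab}")
```

In a closer reproduction one would swap the stand-in classifier for a random forest with out-of-bag accuracy and replace the fixed tolerance with a statistical test on the two accuracy distributions.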
Pages: 549-555
Number of pages: 7