Random forests for global sensitivity analysis: A selective review

Cited by: 173
Authors
Antoniadis, Anestis [1, 2]
Lambert-Lacroix, Sophie [3]
Poggi, Jean-Michel [4, 5]
Affiliations
[1] Univ Grenoble, Lab Jean Kuntzmann, F-38041 Grenoble, France
[2] Univ Cape Town, Dept Stat Sci, Rondebosch, South Africa
[3] Univ Grenoble, Lab TIMC IMAG, UMR 5525, F-38041 Grenoble, France
[4] Univ Paris Saclay, Fac Sci Orsay, Lab Math Orsay, Bat 307, F-91405 Orsay, France
[5] Univ Paris, Paris, France
Keywords
Random forests; Global sensitivity analysis; Variable importance; Gene selection; Regression; Indexes; Decomposition; Models
DOI
10.1016/j.ress.2020.107312
Chinese Library Classification
T [Industrial Technology]
Discipline classification code
08
Abstract
Understanding many physical and engineering problems involves running complex computational models. Such models take as input a large number of numerical and physical explanatory variables, and the information on these underlying input parameters is often limited or uncertain. It is therefore important, based on the relationships between the input variables and the output, to identify and prioritize the most influential inputs. One may use global sensitivity analysis (GSA) methods, which aim at ranking input random variables according to their importance in the output uncertainty, or even at quantifying the global influence of a particular input on the output. Using sensitivity metrics to ignore less important parameters is a form of dimension reduction in the model's input parameter space. This suggests the use of meta-modeling as a quantitative approach for nonparametric GSA, where the original input/output relation is first approximated using various statistical regression techniques. The main goal of our work is to provide a comprehensive review paper in the domain of sensitivity analysis, focusing on some interesting connections between random forests and GSA. The idea is to use a random forests methodology as an efficient nonparametric approach for building meta-models that allow an efficient sensitivity analysis. Apart from its easy applicability to regression problems, the random forests approach presents further strong advantages through its ability to implicitly deal with correlation and high-dimensional data, to handle interactions between variables, and to identify informative inputs using a permutation-based RF variable importance index that is easy and fast to compute. We further review an adequate set of tools for quantifying variable importance, which are then exploited to reduce the model's dimension, enabling otherwise infeasible sensitivity analysis studies. Numerical results from several simulations and a data exploration on a real dataset are presented to illustrate the effectiveness of such an approach.
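The permutation-based variable importance idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' code: it assumes scikit-learn and NumPy are available, uses the Ishigami function (a common GSA benchmark, chosen here for illustration) with one added dummy input, and computes importance on a held-out test set rather than on out-of-bag samples. The names ishigami and rf_permutation_importance are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split


def ishigami(X, a=7.0, b=0.1):
    """Ishigami benchmark; only the first three columns of X are used."""
    x1, x2, x3 = X[:, 0], X[:, 1], X[:, 2]
    return np.sin(x1) + a * np.sin(x2) ** 2 + b * x3 ** 4 * np.sin(x1)


def rf_permutation_importance(n_samples=2000, n_trees=200, seed=0):
    """Fit a random forest meta-model and return permutation importances.

    Assumption: inputs are independent and uniform on [-pi, pi]; the
    fourth input is a dummy that does not enter the output at all.
    """
    rng = np.random.default_rng(seed)
    X = rng.uniform(-np.pi, np.pi, size=(n_samples, 4))
    y = ishigami(X)

    # Hold out part of the sample so importance is measured on unseen data.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed
    )

    forest = RandomForestRegressor(n_estimators=n_trees, random_state=seed)
    forest.fit(X_train, y_train)

    # Shuffle one column at a time and record the mean drop in the R^2 score.
    result = permutation_importance(
        forest, X_test, y_test, n_repeats=10, random_state=seed
    )
    return result.importances_mean


if __name__ == "__main__":
    importances = rf_permutation_importance()
    for name, value in zip(["X1", "X2", "X3", "X4 (dummy)"], importances):
        print(f"{name}: {value:.3f}")
```

Under these assumptions the dummy input should receive an importance close to zero, which is the kind of signal that could justify dropping it before a more expensive sensitivity study.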
Pages: 14