Sensitivity analysis approaches to high-dimensional screening problems at low sample size

Cited by: 22
Authors
Becker, W. E. [1 ]
Tarantola, S. [1 ]
Deman, G. [2 ]
Affiliations
[1] European Commiss, Joint Res Ctr, Via E Fermi 2749, I-21027 Ispra, Italy
[2] Univ Neuchatel, Ctr Hydrogeol & Geotherm CHYN, Neuchatel, Switzerland
Keywords
Sensitivity analysis; screening; Sobol' indices; elementary effects; Derivative-based global sensitivity measures; G* function; low-discrepancy sequences; MODELS; VARIABLES; INDEXES; DESIGNS; OUTPUT; LINK;
DOI
10.1080/00949655.2018.1450876
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Sensitivity analysis is an essential tool in the development of robust models for engineering, physical sciences, economics and policy-making, but typically requires running the model a large number of times in order to estimate sensitivity measures. While statistical emulators allow sensitivity analysis even on complex models, they only perform well with a moderately low number of model inputs: in higher-dimensional problems they tend to require a restrictively high number of model runs unless the model is relatively linear. Therefore, an open question is how to tackle sensitivity problems in higher dimensionalities at very low sample sizes. This article examines the relative performance of four sampling-based measures which can be used in such high-dimensional nonlinear problems. The measures tested are the Sobol' total sensitivity indices, the absolute mean of elementary effects, a derivative-based global sensitivity measure, and a modified derivative-based measure. Performance is assessed in a 'screening' context, by assessing the ability of each measure to identify influential and non-influential inputs on a wide variety of test functions at different dimensionalities. The results show that the best-performing measure in the screening context is dependent on the model or function, but derivative-based measures have a significant potential at low sample sizes that is currently not widely recognised.
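As a rough illustration of two of the measures named in the abstract, the sketch below estimates the absolute mean of elementary effects and a derivative-based measure (the mean squared partial derivative) from one-at-a-time finite differences at random base points. It is not the article's procedure: the function `screening_measures`, the toy model, the step size `delta`, and the use of plain pseudo-random sampling (rather than a low-discrepancy sequence) are all assumptions made for this example, and elementary effects are normally computed with a much larger step than the one used here.

```python
import numpy as np

def screening_measures(f, dim, n_base=50, delta=1e-2, seed=0):
    """Estimate two screening measures for f on the unit hypercube.

    Hypothetical helper for illustration only. Costs n_base * (dim + 1)
    model runs. Returns (mu_star, dgsm), one value per input.
    """
    rng = np.random.default_rng(seed)
    # Scale base points so that x_i + delta stays inside [0, 1].
    base = rng.random((n_base, dim)) * (1.0 - delta)
    f0 = np.array([f(x) for x in base])

    mu_star = np.zeros(dim)  # mean of |finite-difference effect|
    dgsm = np.zeros(dim)     # mean of squared finite-difference effect
    for i in range(dim):
        stepped = base.copy()
        stepped[:, i] += delta  # perturb input i only
        effect = (np.array([f(x) for x in stepped]) - f0) / delta
        mu_star[i] = np.mean(np.abs(effect))
        dgsm[i] = np.mean(effect ** 2)
    return mu_star, dgsm

def toy_model(x):
    # Assumed toy function: inputs 0 and 1 matter, the rest are inert.
    return 4.0 * x[0] + 2.0 * x[1] ** 2

mu_star, dgsm = screening_measures(toy_model, dim=5)

# Rank inputs from most to least influential according to mu_star.
ranking = np.argsort(-mu_star)
print("mu*:", mu_star)
print("DGSM:", dgsm)
print("ranking:", ranking)
```

In a screening setting, inputs whose measure is close to zero (here, inputs 2 to 4) would be candidates for fixing at nominal values, while the remaining inputs would be kept for a more detailed analysis.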
Pages: 2089 - 2110
Page count: 22