Bootstrapping in a high dimensional but very low-sample size problem

Cited by: 3
Authors
Song, Juhee [1]
Hart, Jeffrey D. [2]
Affiliations
[1] Scott & White Mem Hosp & Clin, Dept Biostat, Temple, TX 76508 USA
[2] Texas A&M Univ, Dept Stat, College Stn, TX 77843 USA
Keywords
bootstrap-based test; cluster analysis; FDR (false discovery rate); HDLSS (high-dimensional, low-sample size) data; kernel density estimation; mixture model; FALSE DISCOVERY RATE; EXPRESSION; MIXTURE; CONSISTENCY
DOI
10.1080/00949650902798129
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
This article is concerned with testing multiple hypotheses, one for each of a large number of small data sets. Such data are sometimes referred to as high-dimensional, low-sample size data. Our model assumes that each observation within a randomly selected small data set follows a mixture of C shifted and rescaled versions of an arbitrary density f. A novel kernel density estimation scheme, in conjunction with clustering methods, is applied to estimate f. The Bayes information criterion and a new criterion, the weighted mean of within-cluster variances, are used to estimate C, the number of mixture components or clusters. These results are applied to the multiple testing problem. The null sampling distribution of each test statistic is determined by f, and hence a bootstrap procedure that resamples from an estimate of f is used to approximate this null distribution.
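The bootstrap step described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract does not specify their kernel estimator, test statistic, or clustering step, so the sketch assumes a one-sample t-statistic per data set, a single cluster (C = 1), and a crude stand-in estimate of f built from pooled within-set standardized residuals smoothed with a Gaussian kernel. The names `t_stats` and `bootstrap_null_t`, the bandwidth rule, and the synthetic data are all hypothetical choices made for illustration. With very small samples, standardized residuals are themselves a distorted picture of f, which is presumably why the article needs a dedicated estimation scheme.

```python
import numpy as np

def t_stats(x):
    """One-sample t-statistic for each row of x (each row is one small data set)."""
    n = x.shape[1]
    return np.sqrt(n) * x.mean(axis=1) / x.std(axis=1, ddof=1)

def bootstrap_null_t(residuals, n, n_boot, bandwidth, rng):
    """Draw n_boot samples of size n from a Gaussian-kernel estimate of f
    (resample pooled residuals, then add kernel noise) and return their t-statistics."""
    draws = rng.choice(residuals, size=(n_boot, n), replace=True)
    draws = draws + bandwidth * rng.standard_normal((n_boot, n))
    return t_stats(draws)

rng = np.random.default_rng(0)

# Synthetic HDLSS data (assumption): p small data sets of n observations each,
# every null hypothesis "mean = 0" true.
p, n = 2000, 4
data = rng.standard_normal((p, n))

# Stand-in estimate of f (assumption): pool residuals standardized within each set.
centered = data - data.mean(axis=1, keepdims=True)
residuals = (centered / data.std(axis=1, ddof=1, keepdims=True)).ravel()

# Normal-reference rule-of-thumb bandwidth (assumption, not from the article).
bandwidth = 1.06 * residuals.std() * residuals.size ** (-0.2)

# Approximate the null distribution of the statistic by resampling from the estimate of f.
n_boot = 5000
null_t = bootstrap_null_t(residuals, n, n_boot, bandwidth, rng)

# Two-sided bootstrap p-value for each data set:
# share of null statistics at least as extreme as the observed one.
observed = t_stats(data)
null_sorted = np.sort(np.abs(null_t))
exceed = n_boot - np.searchsorted(null_sorted, np.abs(observed), side="left")
p_values = (exceed + 1) / (n_boot + 1)
```

The resulting `p_values` array holds one p-value per small data set and could then be passed to a false discovery rate procedure, which this sketch leaves out.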
Pages: 825-840
Page count: 16
Related papers
50 in total
  • [21] Covariance Matrix Reconstruction Using Parsimonious Measurements and Low-sample Support
    Hassanien, Aboulnasr
    Amin, Moeness G.
    [J]. 2020 IEEE RADAR CONFERENCE (RADARCONF20), 2020,
  • [22] ESTIMATING SOIL PARAMETERS AND SAMPLE-SIZE BY BOOTSTRAPPING
    DANE, JH
    REED, RB
    HOPMANS, JW
    [J]. SOIL SCIENCE SOCIETY OF AMERICA JOURNAL, 1986, 50 (02) : 283 - 287
  • [23] Geometry of Goodness-of-Fit Testing in High Dimensional Low Sample Size Modelling
    Marriott, Paul
    Sabolova, Radka
    Van Bever, Germain
    Critchley, Frank
    [J]. GEOMETRIC SCIENCE OF INFORMATION, GSI 2015, 2015, 9389 : 569 - 576
  • [24] Sensitivity analysis approaches to high-dimensional screening problems at low sample size
    Becker, W. E.
    Tarantola, S.
    Deman, G.
    [J]. JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2018, 88 (11) : 2089 - 2110
  • [25] Unsupervised classification of high-dimension and low-sample data with variational autoencoder based dimensionality reduction
    Mahmud, Mohammad Sultan
    Fu, Xianghua
    [J]. 2019 IEEE 4TH INTERNATIONAL CONFERENCE ON ADVANCED ROBOTICS AND MECHATRONICS (ICARM 2019), 2019, : 498 - 503
  • [26] Low-sample classification in NIDS using the EC-GAN method
    Zekan, Marko
    Tomicic, Igor
    Schatten, Markus
    [J]. JOURNAL OF UNIVERSAL COMPUTER SCIENCE, 2022, 28 (12) : 1330 - 1346
  • [27] On feature selection protocols for very low-sample-size data
    Kuncheva, Ludmila I.
    Rodriguez, Juan J.
    [J]. PATTERN RECOGNITION, 2018, 81 : 660 - 673
  • [28] Reproducibility and Sample Size in High-Dimensional Data
    Seo, Won Seok
    Choi, Jeea
    Jeong, Hyeong Chul
    Cho, HyungJun
    [J]. KOREAN JOURNAL OF APPLIED STATISTICS, 2010, 23 (06) : 1067 - 1080
  • [29] Power and sample size estimation in high dimensional biology
    Gadbury, GL
    Page, GP
    Edwards, J
    Kayo, T
    Prolla, TA
    Weindruch, R
    Permana, PA
    Mountz, JD
    Allison, DB
    [J]. STATISTICAL METHODS IN MEDICAL RESEARCH, 2004, 13 (04) : 325 - 338
  • [30] Partition clustering of high dimensional low sample size data based on p-values
    von Borries, George
    Wang, Haiyan
    [J]. COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2009, 53 (12) : 3987 - 3998