Bootstrapping in a high dimensional but very low-sample size problem

Cited by: 3
Authors
Song, Juhee [1]
Hart, Jeffrey D. [2]
Affiliations
[1] Scott & White Mem Hosp & Clin, Dept Biostat, Temple, TX 76508 USA
[2] Texas A&M Univ, Dept Stat, College Stn, TX 77843 USA
Keywords
bootstrap-based test; cluster analysis; FDR (false discovery rate); HDLSS (high-dimensional, low-sample size) data; kernel density estimation; mixture model; FALSE DISCOVERY RATE; EXPRESSION; MIXTURE; CONSISTENCY
DOI
10.1080/00949650902798129
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Subject classification codes
081203; 0835
Abstract
This article is concerned with testing multiple hypotheses, one for each of a large number of small data sets. Such data are sometimes referred to as high-dimensional, low-sample size data. Our model assumes that each observation within a randomly selected small data set follows a mixture of C shifted and rescaled versions of an arbitrary density f. A novel kernel density estimation scheme, in conjunction with clustering methods, is applied to estimate f. The Bayes information criterion and a new criterion, the weighted mean of within-cluster variances, are used to estimate C, the number of mixture components or clusters. These results are applied to the multiple testing problem. The null sampling distribution of each test statistic is determined by f, and hence a bootstrap procedure that resamples from an estimate of f is used to approximate this null distribution.
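As a rough illustration of the last step in the abstract, the sketch below resamples from a Gaussian kernel density estimate (a smoothed bootstrap) to approximate the null distribution of a one-sample t-statistic for small samples of size n. This is not the authors' estimator: the function name `smoothed_bootstrap_null`, the use of pooled centered residuals as input, the Gaussian kernel, and the normal-reference bandwidth are all assumptions made for the example, and the clustering step that estimates C is omitted.

```python
import numpy as np

def smoothed_bootstrap_null(residuals, n, n_boot=2000, bandwidth=None, seed=0):
    """Approximate the null distribution of sqrt(n) * mean / sd for samples
    of size n drawn from a Gaussian kernel density estimate of `residuals`.

    Hypothetical helper for illustration; not the method of the paper.
    """
    rng = np.random.default_rng(seed)
    residuals = np.asarray(residuals, dtype=float)
    if bandwidth is None:
        # Normal-reference rule of thumb (an assumed bandwidth choice).
        bandwidth = 1.06 * residuals.std(ddof=1) * residuals.size ** (-0.2)
    # Drawing from a Gaussian KDE is the same as picking a data point at
    # random and adding Gaussian noise with standard deviation `bandwidth`.
    idx = rng.integers(0, residuals.size, size=(n_boot, n))
    samples = residuals[idx] + bandwidth * rng.standard_normal((n_boot, n))
    # Recenter so that the resampling density has mean zero (the null).
    samples -= residuals.mean()
    means = samples.mean(axis=1)
    sds = samples.std(axis=1, ddof=1)
    return np.sqrt(n) * means / sds

# Usage: many small data sets (rows), each of size n = 4, simulated here.
rng = np.random.default_rng(1)
data = rng.standard_normal((500, 4))
pooled = (data - data.mean(axis=1, keepdims=True)).ravel()  # centered residuals
null_stats = smoothed_bootstrap_null(pooled, n=4)

# Two-sided bootstrap p-value for the first data set.
t_obs = np.sqrt(4) * data[0].mean() / data[0].std(ddof=1)
p_value = np.mean(np.abs(null_stats) >= abs(t_obs))
```

Pooling residuals across the small data sets is what makes the density estimate feasible when each data set alone has only a handful of observations; how the pooling and rescaling should be done is exactly where the article's kernel scheme and clustering come in, so the centering used here should be read as a placeholder.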
Pages: 825-840
Page count: 16