A TWO-SAMPLE TEST FOR HIGH-DIMENSIONAL DATA WITH APPLICATIONS TO GENE-SET TESTING

Cited by: 426
Authors
Chen, Song Xi [1 ,2 ]
Qin, Ying-Li [1 ]
Affiliations
[1] Iowa State Univ, Dept Stat, Ames, IA 50011 USA
[2] Peking Univ, Guanghua Sch Management, Beijing 100871, Peoples R China
Source
ANNALS OF STATISTICS | 2010 / Vol. 38 / No. 2
Keywords
High dimension; gene-set testing; large p small n; martingale central limit theorem; multiple comparison; FALSE DISCOVERY RATE; MICROARRAY DATA; COVARIANCE-MATRIX; HYPOTHESIS TESTS; NORMALIZATION; CONSISTENCY; CATEGORIES; EXPRESSION; LIMIT; MODEL;
DOI
10.1214/09-AOS716
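Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Discipline classification code
020208 ; 070103 ; 0714 ;
Abstract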
We propose a two-sample test for the means of high-dimensional data when the data dimension is much larger than the sample size. Hotelling's classical T² test does not work for this "large p, small n" situation. The proposed test does not require explicit conditions in the relationship between the data dimension and sample size. This offers much flexibility in analyzing high-dimensional data. An application of the proposed test is in testing significance for sets of genes which we demonstrate in an empirical study on a leukemia data set.
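The "large p, small n" difficulty mentioned in the abstract can be illustrated with a short sketch. Hotelling's T² needs the inverse of the pooled sample covariance matrix, and that matrix has rank at most n1 + n2 - 2, so it is singular whenever p exceeds that bound. The abstract does not give the formula of the proposed test; the function `mean_difference_statistic` below is an assumption of this sketch. It computes an unstandardized statistic from inner products of distinct observations only, which needs no matrix inverse, and it omits the variance estimate required to obtain a p-value. The sample sizes, dimension, mean shift, and simulated data are all hypothetical.

```python
import numpy as np


def pooled_covariance(x1, x2):
    """Pooled sample covariance (p x p) of two samples with rows as observations."""
    n1, n2 = x1.shape[0], x2.shape[0]
    c1 = x1 - x1.mean(axis=0)
    c2 = x2 - x2.mean(axis=0)
    return (c1.T @ c1 + c2.T @ c2) / (n1 + n2 - 2)


def mean_difference_statistic(x1, x2):
    """Unstandardized mean-difference statistic (assumed form, not from the abstract).

    Uses sums of inner products x_i' x_j over distinct pairs (i != j) within
    each sample, minus twice the average cross-sample inner product.
    """
    n1, n2 = x1.shape[0], x2.shape[0]
    g11 = x1 @ x1.T  # n1 x n1 Gram matrix of sample 1
    g22 = x2 @ x2.T  # n2 x n2 Gram matrix of sample 2
    g12 = x1 @ x2.T  # n1 x n2 cross-sample inner products
    # Subtracting the trace removes the i == j terms.
    t11 = (g11.sum() - np.trace(g11)) / (n1 * (n1 - 1))
    t22 = (g22.sum() - np.trace(g22)) / (n2 * (n2 - 1))
    return t11 + t22 - 2.0 * g12.sum() / (n1 * n2)


# Hypothetical simulated data: dimension far larger than both sample sizes.
rng = np.random.default_rng(0)
n1, n2, p = 10, 12, 200
x1 = rng.standard_normal((n1, p))
x2 = rng.standard_normal((n2, p)) + 0.5  # shift every coordinate mean by 0.5

s_pooled = pooled_covariance(x1, x2)
rank = np.linalg.matrix_rank(s_pooled)
print(f"pooled covariance is {p} x {p} with rank {rank}; it cannot be inverted")

t_stat = mean_difference_statistic(x1, x2)
print(f"unstandardized statistic: {t_stat:.3f}")
```

Working with the small Gram matrices (n1 x n1, n2 x n2, n1 x n2) rather than a p x p covariance keeps the computation cheap when p is in the thousands, as is typical for gene-expression data.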
Pages: 808 - 835
Number of pages: 28
Related papers
50 in total
  • [21] A high-dimensional spatial rank test for two-sample location problems
    Feng, Long
    Zhang, Xiaoxu
    Liu, Binghui
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2020, 144
  • [22] Two-sample high-dimensional empirical likelihood
    Fang, Jianglin
    Liu, Wanrong
    Lu, Xuewen
    COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2017, 46 (13) : 6323 - 6335
  • [23] TWO-SAMPLE BEHRENS-FISHER PROBLEM FOR HIGH-DIMENSIONAL DATA
    Feng, Long
    Zou, Changliang
    Wang, Zhaojun
    Zhu, Lixing
    STATISTICA SINICA, 2015, 25 (04) : 1297 - 1312
  • [24] A nonparametric two-sample test applicable to high dimensional data
    Biswas, Munmun
    Ghosh, Anil K.
    JOURNAL OF MULTIVARIATE ANALYSIS, 2014, 123 : 160 - 171
  • [25] Simple and efficient adaptive two-sample tests for high-dimensional data
    Li, Jun
    COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2021, 50 (19) : 4428 - 4447
  • [26] Power-Enhanced Simultaneous Test of High-Dimensional Mean Vectors and Covariance Matrices with Application to Gene-Set Testing
    Yu, Xiufan
    Li, Danning
    Xue, Lingzhou
    Li, Runze
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2023, 118 (544) : 2548 - 2561
  • [27] Robust two-sample test of high-dimensional mean vectors under dependence
    Wang, Wei
    Lin, Nan
    Tang, Xiang
    JOURNAL OF MULTIVARIATE ANALYSIS, 2019, 169 : 312 - 329
  • [28] DISTRIBUTION AND CORRELATION-FREE TWO-SAMPLE TEST OF HIGH-DIMENSIONAL MEANS
    Xue, Kaijie
    Yao, Fang
    ANNALS OF STATISTICS, 2020, 48 (03) : 1304 - 1328
  • [29] A high-dimensional inverse norm sign test for two-sample location problems
    Huang, Xifen
    Liu, Binghui
    Zhou, Qin
    Feng, Long
    CANADIAN JOURNAL OF STATISTICS-REVUE CANADIENNE DE STATISTIQUE, 2023, 51 (04) : 1004 - 1033
  • [30] Two-sample inference for high-dimensional Markov networks
    Kim, Byol
    Liu, Song
    Kolar, Mladen
    JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 2021, 83 (05) : 939 - 962