TESTING INDEPENDENCE WITH HIGH-DIMENSIONAL CORRELATED SAMPLES

Cited by: 7
Authors
Chen, Xi [1]
Liu, Weidong [2, 3]
Affiliations
[1] NYU, Stern Sch Business, 44 West 4th St, New York, NY 10012 USA
[2] Shanghai Jiao Tong Univ, Dept Math, Inst Nat Sci, Shanghai, Peoples R China
[3] Shanghai Jiao Tong Univ, MOE LSC, Shanghai, Peoples R China
Source
ANNALS OF STATISTICS | 2018 / Vol. 46 / No. 02
Funding
Australian Research Council;
Keywords
Independence test; multiple testing of correlations; false discovery rate; matrix-variate normal; quadratic functional estimation; high-dimensional sample correlation matrix; FALSE DISCOVERY RATE; COVARIANCE-MATRIX; PHASE-TRANSITION; OPTIMAL RATES; DISTRIBUTIONS; CONVERGENCE; COHERENCE; FRAMEWORK; STRENGTH; GENES;
DOI
10.1214/17-AOS1571
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification codes
020208 ; 070103 ; 0714 ;
Abstract
Testing independence among a number of (ultra) high-dimensional random samples is a fundamental and challenging problem. By arranging n identically distributed p-dimensional random vectors into a p × n data matrix, we investigate the problem of testing independence among columns under the matrix-variate normal modeling of data. We propose a computationally simple and tuning-free test statistic, characterize its limiting null distribution, analyze the statistical power and prove its minimax optimality. As an important by-product of the test statistic, a ratio-consistent estimator for the quadratic functional of a covariance matrix from correlated samples is developed. We further study the effect of correlation among samples on an important high-dimensional inference problem: large-scale multiple testing of Pearson's correlation coefficients. Indeed, blindly using classical inference results based on the assumed independence of samples will lead to many false discoveries, which suggests the need for conducting independence testing before applying existing methods. To address the challenge arising from correlation among samples, we propose a "sandwich estimator" of Pearson's correlation coefficient by de-correlating the samples. Based on this approach, the resulting multiple testing procedure asymptotically controls the overall false discovery rate at the nominal level while maintaining good statistical power. Both simulated and real data experiments are carried out to demonstrate the advantages of the proposed methods.
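The abstract's warning about false discoveries can be illustrated with a small simulation. The sketch below is not the paper's test statistic or its sandwich estimator. It assumes a mean-zero matrix-variate normal p × n matrix whose rows (variables) are mutually independent, so every pairwise Pearson correlation is truly null, and whose columns (samples) follow a hypothetical AR(1) covariance that is treated as known. The names `ar1_cov` and `null_rejection_rate`, the values of p, n and rho, and the use of a classical Fisher z-test are all illustrative assumptions.

```python
import numpy as np


def ar1_cov(n, rho):
    """AR(1) covariance among the n samples (an assumed, illustrative choice)."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])


def null_rejection_rate(X, z_crit=1.959963984540054):
    """Fraction of variable pairs rejected by a two-sided 5% Fisher z-test.

    X is p x n: rows are variables, columns are samples. The test assumes
    the n samples are independent, which is the classical setting.
    """
    p, n = X.shape
    R = np.corrcoef(X)                      # p x p sample correlation matrix
    iu = np.triu_indices(p, k=1)            # each pair of variables once
    r = np.clip(R[iu], -0.999999, 0.999999)
    z = np.arctanh(r) * np.sqrt(n - 3)      # approx. N(0, 1) under the null
    return float(np.mean(np.abs(z) > z_crit))


rng = np.random.default_rng(0)
p, n, rho = 200, 50, 0.8

B = ar1_cov(n, rho)                         # covariance among samples
L = np.linalg.cholesky(B)                   # B = L @ L.T

Z = rng.standard_normal((p, n))             # independent samples
X_corr = Z @ L.T                            # each row now has covariance B
# De-correlate with the (here assumed known) sample covariance: X @ L^{-T}
X_decorr = np.linalg.solve(L, X_corr.T).T

rate_indep = null_rejection_rate(Z)
rate_corr = null_rejection_rate(X_corr)
rate_decorr = null_rejection_rate(X_decorr)

print(f"independent samples:   {rate_indep:.3f}")
print(f"correlated samples:    {rate_corr:.3f}")
print(f"de-correlated samples: {rate_decorr:.3f}")
```

Because all variables are independent by construction, every rejection here is a false discovery. In practice the covariance among samples is unknown and must be estimated, which is the harder problem the paper addresses.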
Pages: 866-894
Number of pages: 29
Related papers
50 in total
  • [1] Testing independence in high-dimensional multivariate normal data
    Najarzadeh, D.
    [J]. COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2021, 50 (14) : 3421 - 3435
  • [2] Penalized Independence Rule for Testing High-Dimensional Hypotheses
    Shen, Yanfeng
    Lin, Zhengyan
    Zhu, Jun
    [J]. COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2011, 40 (13) : 2424 - 2435
  • [3] Hierarchical Testing in the High-Dimensional Setting With Correlated Variables
    Mandozzi, Jacopo
    Buhlmann, Peter
    [J]. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2016, 111 (513) : 331 - 343
  • [4] A Simple Method for Testing Independence of High-Dimensional Random Vectors
    Jakimauskas, Gintautas
    Radavicius, Marijus
    Susinskas, Jurgis
    [J]. AUSTRIAN JOURNAL OF STATISTICS, 2008, 37 (01) : 101 - 108
  • [5] HIGH-DIMENSIONAL CONSISTENT INDEPENDENCE TESTING WITH MAXIMA OF RANK CORRELATIONS
    Drton, Mathias
    Han, Fang
    Shi, Hongjian
    [J]. ANNALS OF STATISTICS, 2020, 48 (06) : 3206 - 3227
  • [6] Testing Independence Among a Large Number of High-Dimensional Random Vectors
    Pan, Guangming
    Gao, Jiti
    Yang, Yanrong
    [J]. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2014, 109 (506) : 600 - 612
  • [7] A Sequential Rejection Testing Method for High-Dimensional Regression with Correlated Variables
    Mandozzi, Jacopo
    Buhlmann, Peter
    [J]. INTERNATIONAL JOURNAL OF BIOSTATISTICS, 2016, 12 (01) : 79 - 95
  • [8] Testing for independence of high-dimensional variables: ρV-coefficient based approach
    Hyodo, Masashi
    Nishiyama, Takahiro
    Pavlenko, Tatjana
    [J]. JOURNAL OF MULTIVARIATE ANALYSIS, 2020, 178
  • [9] Clustering of High-Dimensional and Correlated Data
    McLachlan, Geoffrey J.
    Ng, Shu-Kay
    Wang, K.
    [J]. DATA ANALYSIS AND CLASSIFICATION, 2010, : 3 - 11
  • [10] Distance correlation test for high-dimensional independence
    Li, Weiming
    Wang, Qinwen
    Yao, Jianfeng
    [J]. BERNOULLI, 2024, 30 (04) : 3165 - 3192