A new robust covariance matrix estimation for high-dimensional microbiome data

Cited by: 0
Authors
Wang, Jiyang [1, 2]
Liang, Wanfeng [3]
Li, Lijie [1]
Wu, Yue [1]
Ma, Xiaoyan [4]
Affiliations
[1] Nankai Univ, Sch Stat & Data Sci, Tianjin 300071, Peoples R China
[2] Xinjiang Univ, Coll Math & Syst Sci, Urumqi 830046, Xinjiang, Peoples R China
[3] Dongbei Univ Finance & Econ, Sch Data Sci & Artificial Intelligence, Dalian 116025, Liaoning, Peoples R China
[4] Ningxia Univ, Sch Math & Stat, Yinchuan 750021, Ningxia, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
centred log-ratio; covariance matrix; high dimension; microbiome data; robustness; thresholding; OPTIMAL RATES; COMPOSITIONAL DATA; GUT MICROBIOME; CONVERGENCE; PATTERNS; OBESITY;
DOI
10.1111/anzs.12415
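Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714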
Abstract
Microbiome data typically lie in a high-dimensional simplex. One of the key questions in metagenomic analysis is to exploit the covariance structure for this kind of data. In this paper, a framework called approximate-estimate-threshold (AET) is developed for the robust basis covariance estimation for high-dimensional microbiome data. To be specific, we first construct a proxy matrix $\boldsymbol{\Gamma}$, which is almost indistinguishable from the real basis covariance matrix $\boldsymbol{\Sigma}$. Then, any estimator $\hat{\boldsymbol{\Gamma}}$ satisfying some conditions can be used to estimate $\boldsymbol{\Gamma}$. Finally, we impose a thresholding step on $\hat{\boldsymbol{\Gamma}}$ to obtain the final estimator $\hat{\boldsymbol{\Sigma}}$. In particular, this paper applies a Huber-type estimator $\hat{\boldsymbol{\Gamma}}$, and achieves robustness by only requiring the boundedness of $2+\epsilon$ moments for some $\epsilon \in (0,2]$. We derive the convergence rate of $\hat{\boldsymbol{\Sigma}}$ under the spectral norm, and provide theoretical guarantees on support recovery. Extensive simulations and a real example are used to illustrate the empirical performance of our method.
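The three AET steps can be illustrated with a minimal sketch. The abstract does not give the paper's exact proxy matrix, Huber-type estimator, or threshold rule, so everything specific here is an assumption made for illustration: the proxy is taken to be the covariance of centred log-ratio (clr) transformed data, the Huber-type step is an element-wise Huber M-estimate of first and second moments, the final step is hard thresholding of off-diagonal entries, and the names `clr`, `huber_mean`, `huber_covariance`, `hard_threshold`, `aet_sketch`, the pseudo-count, and the tuning parameters `tau_mean`, `tau_prod`, `lam` are all hypothetical.

```python
import numpy as np

def clr(counts, pseudo=0.5):
    """Centred log-ratio transform of an n x p matrix (pseudo-count is an assumption)."""
    logx = np.log(np.asarray(counts, dtype=float) + pseudo)
    return logx - logx.mean(axis=1, keepdims=True)

def huber_mean(values, tau, n_iter=100):
    """Column-wise Huber M-estimate of location via fixed-point iteration."""
    values = np.asarray(values, dtype=float)
    mu = np.median(values, axis=0)
    for _ in range(n_iter):
        # Residuals larger than tau in magnitude are clipped, limiting outlier influence.
        mu = mu + np.clip(values - mu, -tau, tau).mean(axis=0)
    return mu

def huber_covariance(z, tau_mean, tau_prod):
    """Element-wise robust covariance: Huber mean of products minus product of Huber means."""
    n, p = z.shape
    mu = huber_mean(z, tau_mean)
    prods = (z[:, :, None] * z[:, None, :]).reshape(n, p * p)
    second = huber_mean(prods, tau_prod).reshape(p, p)
    gamma = second - np.outer(mu, mu)
    return (gamma + gamma.T) / 2.0

def hard_threshold(gamma, lam):
    """Zero out off-diagonal entries with magnitude below lam; keep the diagonal."""
    sigma = np.where(np.abs(gamma) >= lam, gamma, 0.0)
    np.fill_diagonal(sigma, np.diag(gamma))
    return sigma

def aet_sketch(counts, tau_mean, tau_prod, lam):
    """Approximate (clr proxy), estimate (Huber-type), threshold (hard)."""
    z = clr(counts)
    gamma_hat = huber_covariance(z, tau_mean, tau_prod)
    return hard_threshold(gamma_hat, lam)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    toy_counts = rng.poisson(20, size=(40, 6))  # synthetic data, not from the paper
    print(aet_sketch(toy_counts, tau_mean=1.0, tau_prod=1.0, lam=0.05).round(3))
```

In practice the robustification and threshold parameters would be tuned (for example by cross-validation) rather than fixed by hand as in this toy call.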
Pages: 281-295
Number of pages: 15