Multivariate location and scatter matrix estimation under cellwise and casewise contamination

Cited by: 10
Authors
Leung, Andy [1]
Yohai, Victor [2]
Zamar, Ruben [1]
Affiliations
[1] Univ British Columbia, Dept Stat, 3182-2207 Main Mall, Vancouver, BC V6T 1Z4, Canada
[2] Univ Buenos Aires, Fac Ciencias Exactas & Nat, Dept Matemat, Ciudad Univ,Pabellon 1, RA-1426 Buenos Aires, DF, Argentina
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
Multivariate location and scatter; Robust estimation; Cellwise outliers; Componentwise contamination; ROBUST ESTIMATION; OUTLIER DETECTION; HIGH DIMENSION;
DOI
10.1016/j.csda.2017.02.007
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Real data may contain both cellwise outliers and casewise outliers. There is a vast literature on robust estimation for casewise outliers, but only a scant literature for cellwise outliers and almost none for both types of outliers. Estimation of multivariate location and scatter matrix is a cornerstone of multivariate data analysis. A two-step approach was recently proposed to perform robust estimation of multivariate location and scatter matrix in the presence of cellwise and casewise outliers. In the first step a univariate filter was applied to remove cellwise outliers. In the second step a generalized S-estimator was used to downweight casewise outliers. This proposal can be further improved in three main directions: first, by introducing a consistent bivariate filter to be used in combination with the univariate filter in the first step; second, by proposing a new fast subsampling procedure to generate starting points for the generalized S-estimator in the second step; third, by using a non-monotonic weight function for the generalized S-estimator to better handle casewise outliers in high dimension. A simulation study and a real data example show that, unlike the original two-step procedure, the modified two-step approach performs and scales well in high dimension. Moreover, they show that the modified procedure outperforms the original one and other state-of-the-art robust procedures under cellwise and casewise data contamination. (C) 2017 Elsevier B.V. All rights reserved.
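The first step described in the abstract, flagging individual cells before any casewise estimation, can be illustrated with a deliberately simplified sketch. This is not the filter from the paper: it swaps in a plain median/MAD robust z-score rule, and the function name `univariate_filter` and the `cutoff` parameter are hypothetical names chosen for illustration. Flagged cells are replaced by `None` so that a later step could treat them as missing.

```python
from statistics import median


def univariate_filter(rows, cutoff=3.0):
    """Return a copy of `rows` with suspicious cells replaced by None.

    Simplified stand-in (assumption): a cell is flagged when its robust
    z-score, computed per column from the median and the MAD, exceeds
    `cutoff` in absolute value.
    """
    n_cols = len(rows[0])
    filtered = [list(r) for r in rows]  # copy so the input is not mutated
    for j in range(n_cols):
        col = [r[j] for r in rows]
        center = median(col)
        # 1.4826 rescales the MAD to estimate the standard deviation
        # when the clean data are normally distributed.
        scale = 1.4826 * median(abs(x - center) for x in col)
        if scale == 0:
            continue  # constant column: nothing to flag
        for i, x in enumerate(col):
            if abs(x - center) / scale > cutoff:
                filtered[i][j] = None
    return filtered


data = [
    [1.0, 10.0],
    [1.2, 10.5],
    [0.9, 9.8],
    [1.1, 10.2],
    [50.0, 10.1],  # cellwise outlier in the first column only
    [1.0, 9.9],
]
cleaned = univariate_filter(data)
```

Only the contaminated cell is removed here; the rest of that row stays available, which is the point of filtering cells rather than discarding whole cases.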
Pages: 59-76
Number of pages: 18
Related papers
50 in total
  • [21] Asymptotic Performance of Complex M-Estimators for Multivariate Location and Scatter Estimation
    Meriaux, Bruno
    Ren, Chengfang
    El Korso, Mohammed Nabil
    Breloy, Arnaud
    Forster, Philippe
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2019, 26 (02) : 367 - 371
  • [22] Asymptotics of reweighted estimators of multivariate location and scatter
    Lopuhaä, HP
    [J]. ANNALS OF STATISTICS, 1999, 27 (05): : 1638 - 1665
  • [23] The DetS and DetMM estimators for multivariate location and scatter
    Hubert, Mia
    Rousseeuw, Peter
    Vanpaemel, Dina
    Verdonck, Tim
    [J]. COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2015, 81 : 64 - 75
  • [24] MULTIVARIATE TAU-ESTIMATORS FOR LOCATION AND SCATTER
    LOPUHAA, HP
    [J]. CANADIAN JOURNAL OF STATISTICS-REVUE CANADIENNE DE STATISTIQUE, 1991, 19 (03): : 307 - 321
  • [25] REDESCENDING M-ESTIMATES OF MULTIVARIATE LOCATION AND SCATTER
    KENT, JT
    TYLER, DE
    [J]. ANNALS OF STATISTICS, 1991, 19 (04): : 2102 - 2119
  • [26] ROBUST M-ESTIMATORS OF MULTIVARIATE LOCATION AND SCATTER
    MARONNA, RA
    [J]. ANNALS OF STATISTICS, 1976, 4 (01): : 51 - 67
  • [27] The S-estimator of multivariate location and scatter in Stata
    Verardi, Vincenzo
    McCathie, Alice
    [J]. STATA JOURNAL, 2012, 12 (02): : 299 - 307
  • [28] Estimation of the multivariate normal precision matrix under the entropy loss
    Zhou, X
    Sun, XQ
    Wang, JL
    [J]. ANNALS OF THE INSTITUTE OF STATISTICAL MATHEMATICS, 2001, 53 (04) : 760 - 768
  • [29] Estimation of the Multivariate Normal Precision Matrix under the Entropy Loss
    Xian Zhou
    Xiaoqian Sun
    Jinglong Wang
    [J]. Annals of the Institute of Statistical Mathematics, 2001, 53 : 760 - 768
  • [30] Estimation of a multivariate normal covariance matrix under a certain structure
    Gupta, AK
    Sheena, Y
    [J]. STATISTICS, 2004, 38 (05) : 371 - 379