Non-asymptotic analysis and inference for an outlyingness induced winsorized mean

Author
Zuo, Yijun [1]
Affiliation
[1] Michigan State Univ, Dept Stat & Probabil, E Lansing, MI 48824 USA
Keywords
Non-asymptotic analysis; Centrality estimation; Sub-Gaussian performance; Computability; Finite sample breakdown point; Multivariate location; Depth; Covariance
DOI
10.1007/s00362-022-01353-5
Chinese Library Classification
O21 [Probability and Mathematical Statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract
Robust estimation of a mean vector, a topic regarded as obsolete in the traditional robust statistics community, has surged in the machine learning literature over the last decade. The latest focus is on the sub-Gaussian performance and computability of the estimators in a non-asymptotic setting. Numerous traditional robust estimators are computationally intractable, which partly explains the renewed interest in robust mean estimation. Robust centrality estimators, however, include the trimmed mean and the sample median. The latter has the best robustness but suffers from low efficiency. The trimmed mean and the median of means, both achieving sub-Gaussian performance, have been proposed and studied in the literature. This article investigates the robustness of leading sub-Gaussian estimators of the mean and reveals that none of them can resist more than 25% contamination in the data. It therefore introduces an outlyingness induced winsorized mean that has the best possible robustness (it can resist up to 50% contamination without breakdown) while achieving high efficiency. Furthermore, in a finite sample setting and at a given confidence level, it has sub-Gaussian performance for uncontaminated samples and a bounded estimation error for contaminated samples. It can be computed in linear time.
Pages: 1465-1481
Page count: 17
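
The contrast drawn in the abstract between the median of means and a winsorized mean can be illustrated with a small univariate sketch. This record does not give the estimator's definition, so everything below is an assumption made for illustration: outlyingness is taken to be |x - median| / MAD, points whose outlyingness exceeds a hypothetical threshold `beta` are pulled back to median ± beta·MAD, and the function names are invented. The medians here are computed by sorting, so this sketch is not the linear-time procedure the abstract mentions.

```python
import statistics


def median_of_means(xs, k):
    """Median of the means of k contiguous, equal-sized blocks (leftover points are dropped)."""
    size = len(xs) // k
    block_means = [sum(xs[i * size:(i + 1) * size]) / size for i in range(k)]
    return statistics.median(block_means)


def winsorized_mean(xs, beta=3.0):
    """Illustrative winsorized mean; the outlyingness rule and beta are assumptions.

    Points farther than beta * MAD from the median are replaced by the
    nearest boundary value, then the ordinary average is taken.
    """
    med = statistics.median(xs)
    mad = statistics.median([abs(x - med) for x in xs])
    if mad == 0:
        # Degenerate case: at least half the points coincide with the median.
        return med
    lo, hi = med - beta * mad, med + beta * mad
    return sum(min(max(x, lo), hi) for x in xs) / len(xs)


# 40% of the sample is replaced by a gross outlier, spread across the blocks.
contaminated = [1, 1e6, 3, 1e6, 5, 1e6, 7, 1e6, 9, 10]

print(statistics.mean(contaminated))      # dragged toward the outliers
print(median_of_means(contaminated, 5))   # most blocks contain an outlier
print(winsorized_mean(contaminated))      # stays on the scale of the clean points
```

In this toy sample four of the five blocks contain an outlier, so the median of block means is carried away with them, whereas the winsorized average remains bounded because fewer than half of the points are contaminated and the median and MAD are therefore still determined by clean data.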