A robust proposal of estimation for the sufficient dimension reduction problem

Cited by: 0
Authors
Bergesio, Andrea [1]
Szretter Noste, Maria Eugenia [2]
Yohai, Victor J. [2, 3, 4]
Affiliations
[1] Univ Nacl Litoral, Fac Ingn Quim, Dept Matemat, Santa Fe, Argentina
[2] Univ Buenos Aires, Inst Calculo, Fac Ciencias Exactas & Nat, Buenos Aires, DF, Argentina
[3] Univ Buenos Aires, Fac Ciencias Exactas & Nat, Dept Matemat, Buenos Aires, DF, Argentina
[4] Consejo Nacl Invest Cient & Tecn, Consejo Nacl Invest Cient & Tecn, Buenos Aires, DF, Argentina
Keywords
tau-Estimators; Principal fitted components; Multivariate reduced-rank regression; Robustness; SLICED INVERSE REGRESSION; HIGH BREAKDOWN-POINT;
DOI
10.1007/s11749-020-00745-9
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification code
020208 ; 070103 ; 0714 ;
Abstract
In nonparametric regression contexts, when the number of covariables is large, we face the curse of dimensionality. One way to deal with this problem when the sample is not large enough is using a reduced number of linear combinations of the explanatory variables that contain most of the information about the response variable. This leads to the so-called sufficient reduction problem. The purpose of this paper is to obtain robust estimators of a sufficient dimension reduction, that is, estimators which are not very much affected by the presence of a small fraction of outliers in the data. One way to derive a sufficient dimension reduction is by means of the principal fitted components (PFC) model. We obtain robust estimations for the parameters of this model and the corresponding sufficient dimension reduction based on a tau-scale (tau-estimators). Strong consistency of these estimators under weak assumptions of the underlying distribution is proven. The tau-estimators for the PFC model are computed using an iterative algorithm. A Monte Carlo study compares the performance of tau-estimators and maximum likelihood estimators. The results show clear advantages for tau-estimators in the presence of outlier contamination and only small loss of efficiency when outliers are absent. A proposal to select the dimension of the reduction space based on cross-validation is given. These estimators are implemented in R language through functions contained in the package tauPFC. As the PFC model is a special case of multivariate reduced-rank regression, our proposal can be applied directly to this model as well.
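The abstract builds its estimators on a tau-scale. As a rough illustration of that building block only (not the paper's PFC estimator and not the tauPFC package), the sketch below computes a univariate tau-scale of residuals in the usual two-step way: an M-scale s solving mean(rho1(r/s)) = b, followed by tau = s * sqrt(mean(rho2(r/s))). The bisquare rho function, the tuning constants c1 and c2, the value b = 0.5, and all function names are assumptions chosen for illustration; the resulting scale is not normalized for consistency at the normal distribution.

```python
import numpy as np


def rho_bisquare(u, c):
    """Bisquare rho function scaled so that it equals 1 for |u| >= c."""
    v = np.clip(np.abs(u) / c, 0.0, 1.0)
    return 1.0 - (1.0 - v**2) ** 3


def m_scale(r, c=1.5476, b=0.5, tol=1e-10, max_iter=200):
    """M-scale s solving mean(rho(r / s)) = b by fixed-point iteration.

    c and b are illustrative (assumed) tuning values.
    """
    r = np.asarray(r, dtype=float)
    s = np.median(np.abs(r)) / 0.6745  # normalized median absolute value as start
    if s == 0.0:
        return 0.0
    for _ in range(max_iter):
        s_new = s * np.sqrt(np.mean(rho_bisquare(r / s, c)) / b)
        converged = abs(s_new - s) <= tol * s
        s = s_new
        if converged:
            break
    return s


def tau_scale(r, c1=1.5476, c2=6.08, b=0.5):
    """Tau-scale: the M-scale (rho with c1) reweighted by a second rho (c2)."""
    r = np.asarray(r, dtype=float)
    s = m_scale(r, c=c1, b=b)
    if s == 0.0:
        return 0.0
    return s * np.sqrt(np.mean(rho_bisquare(r / s, c2)))


# Usage: compare the tau-scale and the standard deviation under contamination.
rng = np.random.default_rng(0)
clean = rng.normal(size=500)
contaminated = clean.copy()
contaminated[:50] = 1000.0  # replace 10% of the sample with gross outliers

tau_clean, tau_cont = tau_scale(clean), tau_scale(contaminated)
sd_clean, sd_cont = np.std(clean), np.std(contaminated)
```

In this toy setting the standard deviation is dominated by the outliers, while the tau-scale changes by a bounded factor, which is the kind of behavior the abstract describes for the tau-estimators.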
Pages: 758 - 783
Number of pages: 26
Related papers
50 in total
  • [31] Concordance-based estimation approaches for the optimal sufficient dimension reduction score
    Wang, Shao-Hsuan
    Chiang, Chin-Tsang
    SCANDINAVIAN JOURNAL OF STATISTICS, 2020, 47 (03) : 662 - 689
  • [32] Sufficient Dimension Reduction via Direct Estimation of the Gradients of Logarithmic Conditional Densities
    Sasaki, Hiroaki
    Tangkaratt, Voot
    Niu, Gang
    Sugiyama, Masashi
    NEURAL COMPUTATION, 2018, 30 (02) : 477 - 504
  • [33] Asymptotic results for nonparametric regression estimators after sufficient dimension reduction estimation
    Forzani, Liliana
    Rodriguez, Daniela
    Sued, Mariela
    TEST, 2024 : 987 - 1013
  • [34] Sufficient Dimension Reduction via Squared-Loss Mutual Information Estimation
    Suzuki, Taiji
    Sugiyama, Masashi
    NEURAL COMPUTATION, 2013, 25 (03) : 725 - 758
  • [36] Robust dimension reduction
    Chenouri, Shojaeddin
    Liang, Jiaxi
    Small, Christopher G.
    WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL STATISTICS, 2015, 7 (01) : 63 - 69
  • [37] Sufficient Dimension Reduction for Censored Regressions
    Lu, Wenbin
    Li, Lexin
    BIOMETRICS, 2011, 67 (02) : 513 - 523
  • [38] Sufficient dimension reduction with missing predictors
    Li, Lexin
    Lu, Wenbin
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2008, 103 (482) : 822 - 831
  • [39] Sufficient dimension reduction with additional information
    Hung, Hung
    Liu, Chih-Yen
    Lu, Henry Horng-Shing
    BIOSTATISTICS, 2016, 17 (03) : 405 - 421
  • [40] On hierarchical clustering in sufficient dimension reduction
    Yoo, Chaeyeon
    Yoo, Younju
    Um, Hye Yeon
    Yoo, Jae Keun
    COMMUNICATIONS FOR STATISTICAL APPLICATIONS AND METHODS, 2020, 27 (04) : 431 - 443