Robust distributed estimation and variable selection for massive datasets via rank regression

Cited by: 1
Authors
Luan, Jiaming [1]
Wang, Hongwei [1]
Wang, Kangning [1]
Zhang, Benle [1]
Affiliations
[1] Shandong Technol & Business Univ, 191 Binhai Middle Rd, Yantai 264005, Peoples R China
Keywords
Massive data; Robustness; Communication efficient; Variable selection;
DOI
10.1007/s10463-021-00803-5
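Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Discipline classification codes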
020208 ; 070103 ; 0714 ;
Abstract
Rank regression is a robust modeling tool, but it is challenging to implement for distributed massive data owing to memory constraints. In practice, massive data may be distributed heterogeneously across machines, and how to account for this heterogeneity is also an interesting issue. This paper proposes a distributed rank regression (DR2), which can be implemented on the master machine by solving a weighted least-squares problem and is adaptive when the data are heterogeneous. Theoretically, we prove that the resulting estimator is statistically as efficient as the global rank regression estimator. Furthermore, based on the adaptive LASSO and a newly defined distributed BIC-type tuning parameter selector, we propose a distributed regularized rank regression (DR3), which achieves consistent variable selection and can also be easily implemented using the LARS algorithm on the master machine. Simulation results and a real data analysis are included to validate our method.
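The general idea of combining local rank estimates on a master machine through a weighted least-squares step can be illustrated with a toy sketch. This is not the paper's DR2 estimator: it assumes a single covariate, a one-shot combination, and weights equal to each machine's centered sum of squares of x. All function names (`local_rank_slope`, `local_weight`, `combine_on_master`, `simulate_machine`) and the simulation settings are hypothetical and chosen only for illustration.

```python
import math
import random


def local_rank_slope(x, y):
    """Wilcoxon-type rank estimate of one slope on a single machine.

    Minimizes sum_{i<j} |(y_i - y_j) - b * (x_i - x_j)|, which is solved
    exactly by a weighted median of the pairwise slopes with weights
    |x_i - x_j|.
    """
    pairs = []
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            if dx != 0.0:
                pairs.append(((y[i] - y[j]) / dx, abs(dx)))
    if not pairs:
        raise ValueError("need at least two distinct x values")
    pairs.sort()
    half = 0.5 * sum(w for _, w in pairs)
    acc = 0.0
    for slope, w in pairs:
        acc += w
        if acc >= half:
            return slope
    return pairs[-1][0]


def local_weight(x):
    """Assumed weight: centered sum of squares of x on this machine."""
    mean = sum(x) / len(x)
    return sum((v - mean) ** 2 for v in x)


def combine_on_master(summaries):
    """Weighted least-squares combination of (slope, weight) summaries.

    Minimizes sum_k w_k * (b_k - b)^2 over b, i.e. a weighted average.
    """
    total = sum(w for _, w in summaries)
    return sum(b * w for b, w in summaries) / total


def simulate_machine(rng, n, slope, x_scale):
    """Toy data: y = 1 + slope * x + Cauchy noise (heavy-tailed)."""
    x = [rng.gauss(0.0, x_scale) for _ in range(n)]
    noise = [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)]
    y = [1.0 + slope * xi + ei for xi, ei in zip(x, noise)]
    return x, y


rng = random.Random(0)
true_slope = 2.0
x_scales = [1.0, 2.0, 0.5, 1.5]  # heterogeneous covariate spread per machine

summaries = []
for scale in x_scales:
    x, y = simulate_machine(rng, 150, true_slope, scale)
    # Each machine sends only two numbers to the master.
    summaries.append((local_rank_slope(x, y), local_weight(x)))

estimate = combine_on_master(summaries)
print(round(estimate, 3))
```

In this sketch each machine sends only a slope and a weight, so the communication cost does not grow with the local sample size. The weighting lets machines with more spread in x contribute more, which is one simple way to reflect heterogeneity across machines.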
Pages: 435-450
Number of pages: 16
Related papers
50 in total
  • [41] Weighted LAD-LASSO method for robust parameter estimation and variable selection in regression
    Arslan, Olcay
    [J]. COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2012, 56 (06) : 1952 - 1965
  • [42] Robust estimation and variable selection in heteroscedastic regression model using least favorable distribution
    Yeşim Güney
    Yetkin Tuaç
    Şenay Özdemir
    Olcay Arslan
    [J]. Computational Statistics, 2021, 36 : 805 - 827
  • [43] A Robust Variable Selection Method for Sparse Online Regression via the Elastic Net Penalty
    Wang, Wentao
    Liang, Jiaxuan
    Liu, Rong
    Song, Yunquan
    Zhang, Min
    [J]. MATHEMATICS, 2022, 10 (16)
  • [44] Communication-efficient estimation of quantile matrix regression for massive datasets
    Yang, Yaohong
    Wang, Lei
    Liu, Jiamin
    Li, Rui
    Lian, Heng
    [J]. COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2023, 187
  • [45] Estimation and variable selection in nonparametric heteroscedastic regression
    Yau, P
    Kohn, R
    [J]. STATISTICS AND COMPUTING, 2003, 13 (03) : 191 - 208
  • [46] Estimation and variable selection in nonparametric heteroscedastic regression
    Paul Yau
    Robert Kohn
    [J]. Statistics and Computing, 2003, 13 : 191 - 208
  • [47] The Loss Rank Criterion for Variable Selection in Linear Regression Analysis
    Minh-Ngoc Tran
    [J]. SCANDINAVIAN JOURNAL OF STATISTICS, 2011, 38 (03) : 466 - 479
  • [48] Joint rank and variable selection for parsimonious estimation in a high-dimensional finite mixture regression model
    Devijver, Emilie
    [J]. JOURNAL OF MULTIVARIATE ANALYSIS, 2017, 157 : 1 - 13
  • [49] Variable Selection Linear Regression for Robust Speech Recognition
    Tsao, Yu
    Hu, Ting-Yao
    Sakti, Sakriani
    Nakamura, Satoshi
    Lee, Lin-shan
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2014, E97D (06) : 1477 - 1487
  • [50] Quantile regression for robust estimation and variable selection in partially linear varying-coefficient models
    Yang, Jing
    Lu, Fang
    Yang, Hu
    [J]. STATISTICS, 2017, 51 (06) : 1179 - 1199