Robust communication-efficient distributed composite quantile regression and variable selection for massive data

Cited by: 7
Authors
Wang, Kangning [1 ]
Li, Shaomin [2 ,3 ]
Zhang, Benle [1 ]
Affiliations
[1] Shandong Technol & Business Univ, Sch Stat, Yantai, Peoples R China
[2] Beijing Normal Univ, Ctr Stat & Data Sci, Zhuhai, Peoples R China
[3] Peking Univ, Guanghua Sch Management, Beijing, Peoples R China
Keywords
Massive data; Robustness; Communication-efficient; Composite quantile regression; Variable selection;
DOI
10.1016/j.csda.2021.107262
Chinese Library Classification (CLC)
TP39 [Computer applications];
Discipline classification codes
081203 ; 0835 ;
Abstract
Statistical analysis of massive data is becoming increasingly common. This paper proposes distributed composite quantile regression (CQR) for massive data. Specifically, the global CQR loss function is approximated by a surrogate loss on the first machine, which depends on the local datasets only through their gradients; the estimator is then obtained on the first machine by minimizing this surrogate loss. Because the gradients of the local datasets can be communicated efficiently, the communication cost is significantly reduced. To lighten the computational burden, the induced smoothing method is applied. Theoretically, the resulting estimator is proved to be statistically as efficient as the global CQR estimator. Moreover, as a direct application, a smooth-threshold distributed CQR estimating-equation method for variable selection is proposed. The new methods inherit the robustness and efficiency advantages of CQR. Their promising performance is supported by extensive numerical examples and real data analysis. (C) 2021 Elsevier B.V. All rights reserved.
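The abstract's construction (a one-round gradient correction that lets machine 1 minimize a surrogate of the global smoothed CQR loss) can be sketched as follows. This is a minimal illustration on simulated data, not the authors' implementation: the induced-smoothing form rho_tau(u) ≈ u(tau − Φ(−u/h)) + h·φ(u/h), the bandwidth h = n^(−1/2), and all function names are assumptions made for the sketch.

```python
# Minimal sketch of communication-efficient distributed CQR:
# each machine sends one gradient vector; machine 1 minimizes a
# gradient-corrected surrogate of the global smoothed CQR loss.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(0)

M, n, p, K = 5, 400, 3, 5                     # machines, local n, dim, # quantiles
taus = np.arange(1, K + 1) / (K + 1)          # composite quantile levels
beta_true = np.array([1.0, -2.0, 0.5])
X = [rng.normal(size=(n, p)) for _ in range(M)]
Y = [x @ beta_true + rng.standard_t(df=3, size=n) for x in X]  # heavy-tailed noise

h = 1.0 / np.sqrt(n)                          # induced-smoothing bandwidth (assumed)

def smoothed_cqr_loss(theta, x, y):
    """Induced-smoothing CQR loss: rho_tau(u) ~ u*(tau - Phi(-u/h)) + h*phi(u/h)."""
    b, beta = theta[:K], theta[K:]            # K quantile intercepts, shared slope
    u = y[:, None] - x @ beta[:, None] - b[None, :]   # residuals, shape (n, K)
    return np.mean(u * (taus - norm.cdf(-u / h)) + h * norm.pdf(u / h))

def smoothed_cqr_grad(theta, x, y):
    b, beta = theta[:K], theta[K:]
    u = y[:, None] - x @ beta[:, None] - b[None, :]
    w = taus - norm.cdf(-u / h)               # smooth surrogate of tau - 1{u < 0}
    nk = u.size
    gb = -w.sum(axis=0) / nk                  # d loss / d b_k
    gbeta = -(x.T @ w).sum(axis=1) / nk       # d loss / d beta
    return np.concatenate([gb, gbeta])

# Pilot estimate from machine 1's local data alone.
theta0 = np.zeros(K + p)
pilot = minimize(smoothed_cqr_loss, theta0, args=(X[0], Y[0]),
                 jac=smoothed_cqr_grad, method="L-BFGS-B").x

# One communication round: every machine sends its local gradient at the pilot.
g_global = np.mean([smoothed_cqr_grad(pilot, X[m], Y[m]) for m in range(M)], axis=0)
shift = g_global - smoothed_cqr_grad(pilot, X[0], Y[0])

# Surrogate on machine 1: local loss plus a linear gradient-correction term,
# so the surrogate's gradient at the pilot matches the global gradient.
surrogate = lambda th: smoothed_cqr_loss(th, X[0], Y[0]) + shift @ th
surrogate_grad = lambda th: smoothed_cqr_grad(th, X[0], Y[0]) + shift
theta_hat = minimize(surrogate, pilot, jac=surrogate_grad, method="L-BFGS-B").x
beta_hat = theta_hat[K:]
print(beta_hat)                               # close to beta_true
```

Only one (K + p)-dimensional gradient vector per machine crosses the network, which is the source of the communication savings, and the smoothed loss makes every minimization a standard smooth convex problem.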
Pages: 26