Robust communication-efficient distributed composite quantile regression and variable selection for massive data

Cited by: 7
Authors
Wang, Kangning [1]
Li, Shaomin [2, 3]
Zhang, Benle [1]
Affiliations
[1] Shandong Technol & Business Univ, Sch Stat, Yantai, Peoples R China
[2] Beijing Normal Univ, Ctr Stat & Data Sci, Zhuhai, Peoples R China
[3] Peking Univ, Guanghua Sch Management, Beijing, Peoples R China
Keywords
Massive data; Robustness; Communication-efficient; Composite quantile regression; Variable selection;
DOI
10.1016/j.csda.2021.107262
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203; 0835;
Abstract
Statistical analysis of massive data is becoming increasingly common. This paper proposes distributed composite quantile regression (CQR) for massive data. Specifically, the global CQR loss function is approximated on the first machine by a surrogate loss that relates to the local datasets only through their gradients; the estimator is then obtained on the first machine by minimizing this surrogate loss. Because the gradients of the local datasets can be communicated efficiently, the communication cost is significantly reduced. To reduce the computational burden, the induced smoothing method is applied. Theoretically, the resulting estimator is proved to be statistically as efficient as the global CQR estimator. Moreover, as a direct application, smooth-threshold distributed CQR estimating equations are proposed for variable selection. The new methods inherit the robustness and efficiency advantages of CQR. Their promising performance is supported by extensive numerical examples and a real data analysis. (C) 2021 Elsevier B.V. All rights reserved.
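The surrogate-loss construction described in the abstract can be sketched in symbols. This is only an illustrative sketch: the abstract gives no formulas, so the notation is assumed here (m machines with n observations each, index sets I_j, K quantile levels tau_k, an initial estimate theta-bar, and a bandwidth h), and the surrogate is written in the gradient-corrected form commonly used for communication-efficient estimation. The paper's exact definitions, in particular its induced-smoothing scheme, may differ.

```latex
% Assumed notation; not taken from the record itself.
% Check loss and local CQR loss on machine j (index set \mathcal{I}_j, |\mathcal{I}_j| = n):
\[
\rho_{\tau}(u) = u\,\{\tau - I(u<0)\}, \qquad
L_j(\theta) = \frac{1}{n}\sum_{i \in \mathcal{I}_j}\sum_{k=1}^{K}
  \rho_{\tau_k}\!\left(y_i - a_k - x_i^{\top}\beta\right),
\qquad \theta = (a_1,\dots,a_K,\beta^{\top})^{\top}.
\]
% Global loss (equal local sample sizes assumed) and a gradient-corrected
% surrogate built on machine 1 around an initial estimate \bar\theta:
\[
L_N(\theta) = \frac{1}{m}\sum_{j=1}^{m} L_j(\theta), \qquad
\widetilde{L}(\theta) = L_1(\theta)
  - \Bigl\langle \nabla L_1(\bar\theta) - \frac{1}{m}\sum_{j=1}^{m}\nabla L_j(\bar\theta),\; \theta \Bigr\rangle .
\]
% One simple smoothing of the non-differentiable check loss (an assumption):
% replace \rho_\tau(u) by E[\rho_\tau(u + hZ)], Z ~ N(0,1), whose derivative in u is
\[
\psi_{\tau,h}(u) = \tau - \Phi(-u/h).
\]
```

Under this reading, each machine j sends only the gradient vector of its local loss at theta-bar to the first machine, which is why the communication cost scales with the parameter dimension rather than with the sample size.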
Pages: 26
Related papers
50 in total
  • [21] Robust distributed modal regression for massive data
    Wang, Kangning
    Li, Shaomin
    [J]. COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2021, 160
  • [22] Byzantine-robust and efficient distributed sparsity learning: a surrogate composite quantile regression approach
    Chen, Canyi
    Zhu, Zhengtian
    [J]. STATISTICS AND COMPUTING, 2024, 34 (05)
  • [23] Variable selection via composite quantile regression with dependent errors
    Tang, Yanlin
    Song, Xinyuan
    Zhu, Zhongyi
    [J]. STATISTICA NEERLANDICA, 2015, 69 (01) : 1 - 20
  • [24] VARIABLE SELECTION IN QUANTILE REGRESSION
    Wu, Yichao
    Liu, Yufeng
    [J]. STATISTICA SINICA, 2009, 19 (02) : 801 - 817
  • [25] Improved composite quantile regression and variable selection with nonignorable dropouts
    Ma, Wei
    Wang, Lei
    [J]. RANDOM MATRICES-THEORY AND APPLICATIONS, 2022, 11 (01)
  • [26] Communication-efficient sparse regression
    Lee, Jason D.
    Liu, Qiang
    Sun, Yuekai
    Taylor, Jonathan E.
    [J]. Journal of Machine Learning Research, 2017, 18 : 1 - 30
  • [27] Communication-Efficient Exact Clustering of Distributed Streaming Data
    Tran, Dang-Hoan
    Sattler, Kai-Uwe
    [J]. COMPUTATIONAL SCIENCE AND ITS APPLICATIONS - ICCSA 2013, PT V, 2013, 7975 : 421 - 436
  • [28] Communication-efficient Sparse Regression
    Lee, Jason D.
    Liu, Qiang
    Sun, Yuekai
    Taylor, Jonathan E.
    [J]. JOURNAL OF MACHINE LEARNING RESEARCH, 2017, 18
  • [29] Communication-efficient exact clustering of distributed streaming data
    Tran, Dang-Hoan
    Sattler, Kai-Uwe
    [J]. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2013, 7971 : 421 - 436
  • [30] Single-index composite quantile regression for massive data
    Jiang, Rong
    Yu, Keming
    [J]. JOURNAL OF MULTIVARIATE ANALYSIS, 2020, 180