Residual projection for quantile regression in vertically partitioned big data

Citations: 0
Authors
Fan, Ye [1]
Li, Jr-Shin [2]
Lin, Nan [3]
Affiliations
[1] Capital Univ Econ & Business, Sch Stat, Beijing 100070, Peoples R China
[2] Washington Univ St Louis, Dept Elect & Syst Engn, St Louis, MO 63130 USA
[3] Washington Univ St Louis, Dept Math & Stat, St Louis, MO 63130 USA
Keywords
ADMM; Parallel framework; Privacy preservation; Quantile regression; Residual projection; Vertically distributed big data; COORDINATE DESCENT; ALGORITHMS; SELECTION;
DOI
10.1007/s10618-022-00914-4
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Standard regression techniques model only the mean of the response variable. Quantile regression (QR) is more powerful in that it depicts a comprehensive relationship between the response variable and the covariates at different quantiles. It is particularly useful for non-normally distributed data with skewness or heterogeneity, which appear routinely in many scientific fields, such as economics, finance, public health and biology. Although its theory has been well developed in the literature, its computation in big data still faces multiple challenges, especially for vertically stored big data in modern distributed environments, where communication efficiency and security are usually the primary considerations. While the popular alternating direction method of multipliers (ADMM) provides a general computational solution, its slow convergence becomes a bottleneck when communication cost dominates local computation cost, as in Internet of Things (IoT) networks. Motivated by the residual projection technique, in this paper we propose an innovative iterative parallel framework, PIQR, that converges faster and has a more secure data transmission plan, and we establish its convergence property. This framework is further extended to composite quantile regression (CQR), a modified QR technique that improves estimation efficiency at extreme quantiles. Simulation studies show that both the ADMM-based method and PIQR enjoy favorable estimation accuracy in distributed environments. While PIQR is inferior to the ADMM-based method in local computation, it requires far fewer iterations to achieve convergence, and hence significantly improves the overall computational efficiency when communication cost is the dominating factor. Moreover, PIQR transmits only data involving the residual information between different machines, and can better prevent the leakage of important data information than the ADMM-based method.
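For readers unfamiliar with quantile regression, the following is a minimal sketch of the standard check (pinball) loss that QR minimizes, applied to the simplest case of fitting a constant. It is not the PIQR or ADMM procedure described in the abstract; the function names `check_loss`, `empirical_risk` and `best_constant` and the toy data are assumptions made purely for illustration.

```python
def check_loss(u, tau):
    """Check (pinball) loss: rho_tau(u) = u * (tau - 1[u < 0])."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def empirical_risk(y, q, tau):
    """Average check loss of the constant fit q on the sample y."""
    return sum(check_loss(yi - q, tau) for yi in y) / len(y)


def best_constant(y, tau):
    """Constant minimizing the check-loss risk.

    The risk is piecewise linear in q with kinks at the data points,
    so it is enough to search over the observed values (illustrative
    brute force, not an efficient solver).
    """
    return min(sorted(y), key=lambda q: empirical_risk(y, q, tau))


# Toy skewed sample with one large outlier (hypothetical data).
y = [1.0, 2.0, 3.0, 4.0, 100.0]

mean_fit = sum(y) / len(y)          # pulled toward the outlier
median_fit = best_constant(y, 0.5)  # tau = 0.5 recovers the median
upper_fit = best_constant(y, 0.75)  # tau = 0.75 targets an upper quantile

print(mean_fit, median_fit, upper_fit)
```

On this sample the mean is dragged to 22.0 by the outlier, while the check-loss fits at tau = 0.5 and tau = 0.75 stay at 3.0 and 4.0, which illustrates why quantile-based fits suit skewed data.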
Pages: 710-735
Number of pages: 26
Related papers
  • [1] Residual projection for quantile regression in vertically partitioned big data
    Ye Fan
    Jr-Shin Li
    Nan Lin
    [J]. Data Mining and Knowledge Discovery, 2023, 37 : 710 - 735
  • [2] LOCAL PARTITIONED QUANTILE REGRESSION
    Zhang, Zhengyu
    [J]. ECONOMETRIC THEORY, 2017, 33 (05) : 1081 - 1120
  • [3] ADMM for Penalized Quantile Regression in Big Data
    Yu, Liqun
    Lin, Nan
    [J]. INTERNATIONAL STATISTICAL REVIEW, 2017, 85 (03) : 494 - 518
  • [4] Bayesian Quantile Regression for Big Data Analysis
    Chu, Yuanqi
    Hu, Xueping
    Yu, Keming
    [J]. NEW FRONTIERS IN BAYESIAN STATISTICS, BAYSM 2021, 2022, 405 : 11 - 22
  • [5] Optimal subsampling for quantile regression in big data
    Wang, Haiying
    Ma, Yanyuan
    [J]. BIOMETRIKA, 2021, 108 (01) : 99 - 112
  • [6] Distributed quantile regression for longitudinal big data
    Fan, Ye
    Lin, Nan
    Yu, Liqun
    [J]. COMPUTATIONAL STATISTICS, 2024, 39 (02) : 751 - 779
  • [7] Distributed quantile regression for longitudinal big data
    Ye Fan
    Nan Lin
    Liqun Yu
    [J]. Computational Statistics, 2024, 39 : 751 - 779
  • [8] Privacy-Preserving Logistic Regression on Vertically Partitioned Data
    Song, Lei
    Ma, Chunguang
    Duan, Guanghan
    Yuan, Qi
    [J]. Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2019, 56 (10) : 2243 - 2249
  • [9] Optimal subsampling for composite quantile regression in big data
    Xiaohui Yuan
    Yong Li
    Xiaogang Dong
    Tianqing Liu
    [J]. Statistical Papers, 2022, 63 : 1649 - 1676
  • [10] Optimal subsampling for composite quantile regression in big data
    Yuan, Xiaohui
    Li, Yong
    Dong, Xiaogang
    Liu, Tianqing
    [J]. STATISTICAL PAPERS, 2022, 63 (05) : 1649 - 1676