Distributed adaptive Huber regression

Cited by: 9
Authors
Luo, Jiyu [1 ]
Sun, Qiang [2 ]
Zhou, Wen-Xin [3 ]
Affiliations
[1] Univ Calif San Diego, Herbert Wertheim Sch Publ Hlth & Human Longev Sci, Div Biostat, San Diego, CA 92093 USA
[2] Univ Toronto, Dept Stat Sci, Toronto, ON M5S 3G3, Canada
[3] Univ Calif San Diego, Dept Math, La Jolla, CA 92093 USA
Funding
Natural Sciences and Engineering Research Council of Canada; National Science Foundation (USA)
Keywords
Adaptive Huber regression; Communication efficiency; Distributed inference; Heavy-tailed distribution; Nonasymptotic analysis; ROBUST REGRESSION; QUANTILE REGRESSION; M-ESTIMATORS; ASYMPTOTIC-BEHAVIOR; LINEAR-REGRESSION; PARAMETERS;
DOI
10.1016/j.csda.2021.107419
CLC number
TP39 [Computer applications]
Discipline codes
081203; 0835
Abstract
Distributed data arise naturally when observations from multiple sources are stored at different locations. Directly pooling all the data is often prohibited by limited bandwidth and storage, or by privacy protocols. A new robust distributed algorithm is introduced for fitting linear regressions when the data are subject to heavy-tailed and/or asymmetric errors with finite second moments. The algorithm communicates only gradient information at each iteration and is therefore communication-efficient. The key to achieving the bias-robustness tradeoff is a novel double-robustification approach applied to both the local and global objective functions. Statistically, the resulting estimator achieves the centralized nonasymptotic error bound, as if all the data were pooled together and came from a distribution with sub-Gaussian tails. Under a finite (2 + delta)-th moment condition, a Berry-Esseen bound for the distributed estimator is established, based on which robust confidence intervals are constructed. In high dimensions, the proposed doubly-robustified loss function is complemented with l1-penalization for fitting sparse linear models with distributed data. Numerical studies confirm that, compared with existing distributed methods, the proposed methods achieve near-optimal accuracy with low variability and better coverage with tighter confidence intervals. (C) 2021 Elsevier B.V. All rights reserved.
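As a rough illustration of the communication pattern the abstract describes, the sketch below fits a linear model with a Huber loss across several machines that exchange only local gradients with a central node. This is a generic gradient-averaging sketch, not the paper's double-robustified algorithm: the function names, the fixed step size, and the fixed robustification level tau are illustrative assumptions (the paper's adaptive approach calibrates tau to the noise level and sample size).

```python
import numpy as np

def huber_grad(r, tau):
    # psi(r): derivative of the Huber loss in the residual r,
    # linear inside [-tau, tau], clipped to +/- tau outside.
    return np.where(np.abs(r) <= tau, r, tau * np.sign(r))

def distributed_huber_regression(X_parts, y_parts, tau, lr=0.1, n_iter=500):
    """Each machine holds (X_k, y_k) and, per round, sends only its local
    Huber-loss gradient to the center, which averages the gradients and
    takes one descent step (an assumed generic scheme for illustration)."""
    d = X_parts[0].shape[1]
    beta = np.zeros(d)
    for _ in range(n_iter):
        grads = []
        for X, y in zip(X_parts, y_parts):
            r = y - X @ beta
            # Local gradient of (1/n_k) * sum_i huber(y_i - x_i' beta)
            grads.append(-X.T @ huber_grad(r, tau) / len(y))
        beta -= lr * np.mean(grads, axis=0)  # only gradients are communicated
    return beta
```

Heavy-tailed errors (e.g. Student-t noise) are exactly where the Huber truncation pays off relative to plain least squares, since extreme residuals contribute a bounded amount to each communicated gradient.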
Pages: 23
Related Papers (50 records)
  • [1] Adaptive Huber Regression
    Sun, Qiang
    Zhou, Wen-Xin
    Fan, Jianqing
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2020, 115 (529) : 254 - 265
  • [2] Adaptive Huber regression on Markov-dependent data
    Fan, Jianqing
    Guo, Yongyi
    Jiang, Bai
    STOCHASTIC PROCESSES AND THEIR APPLICATIONS, 2022, 150 : 802 - 818
  • [4] Double debiased transfer learning for adaptive Huber regression
    Wang, Ziyuan
    Wang, Lei
    Lian, Heng
    SCANDINAVIAN JOURNAL OF STATISTICS, 2024, 51 (04) : 1472 - 1505
  • [5] Robust regression through the Huber's criterion and adaptive lasso penalty
    Lambert-Lacroix, Sophie
    Zwald, Laurent
    ELECTRONIC JOURNAL OF STATISTICS, 2011, 5 : 1015 - 1053
  • [6] Enveloped Huber Regression
    Zhou, Le
    Cook, R. Dennis
    Zou, Hui
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2024, 119 (548) : 2722 - 2732
  • [7] Large Scale Huber Regression
    Lei, Dajiang
    Jiang, Zhijie
    Du, Meng
    Chen, Hao
    Wu, Yu
    2018 IEEE 8TH ANNUAL INTERNATIONAL CONFERENCE ON CYBER TECHNOLOGY IN AUTOMATION, CONTROL, AND INTELLIGENT SYSTEMS (IEEE-CYBER), 2018, : 1282 - 1287
  • [8] A VARIANT OF HUBER ROBUST REGRESSION
    BONCELET, CG
    DICKINSON, BW
    SIAM JOURNAL ON SCIENTIFIC AND STATISTICAL COMPUTING, 1984, 5 (03): : 720 - 734
  • [9] Deep Huber quantile regression networks
    Tyralis, Hristos
    Papacharalampous, Georgia
    Dogulu, Nilay
    Chun, Kwok P.
    NEURAL NETWORKS, 2025, 187
  • [10] On pairing Huber support vector regression
    Balasundaram, S.
    Prasad, Subhash Chandra
    APPLIED SOFT COMPUTING, 2020, 97