Online updating Huber robust regression for big data streams

Cited by: 1
Authors
Tao, Chunbai [1 ,2 ]
Wang, Shanshan [1 ,3 ]
Affiliations
[1] Beihang Univ, Sch Econ & Management, Beijing, Peoples R China
[2] Fudan Univ, Sch Data Sci, Shanghai, Peoples R China
[3] Beihang Univ, MOE, Key Lab Complex Syst Anal & Management Decis, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Online updating; Huber regression; big data streams; divide-and-conquer; quantile regression
DOI
10.1080/02331888.2024.2398057
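Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract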
Big data streams have garnered significant attention in multiple industries. However, the immense volume and the presence of outliers in high-velocity streaming data pose great challenges to its analysis. To address these concerns, this paper introduces a novel Online Updating Huber Robust Regression algorithm. By efficiently capturing the salient features of new data subsets, a computationally efficient online updating estimator is proposed without the need for storing historical data. Furthermore, by incorporating Huber regression into its framework, the estimator exhibits robustness to heavy-tailed, heterogeneous as well as outlier-contaminated data. Theoretically, the proposed online updating estimator is asymptotically equivalent to an Oracle estimator derived from the entire dataset. Extensive numerical simulations and a real-world data analysis have been conducted to demonstrate the effectiveness and practicality of the proposed method.
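The abstract does not state the update formulas, so the following is only a minimal sketch of the general idea it describes: fit a Huber regression on each incoming batch, keep a small summary instead of the raw data, and combine summaries into a running estimate. Everything in it is an assumption made for illustration, not the authors' algorithm: the names `fit_huber_batch` and `OnlineHuber`, the fixed threshold `delta=1.345` (which presumes residuals on roughly unit scale), the IRLS solver, and the curvature-weighted averaging rule.

```python
import numpy as np

def huber_weights(r, delta):
    """IRLS weights for the Huber loss: 1 for |r| <= delta, delta/|r| otherwise."""
    a = np.abs(r)
    return np.where(a <= delta, 1.0, delta / np.maximum(a, 1e-12))

def fit_huber_batch(X, y, delta=1.345, n_iter=50, tol=1e-8):
    """Huber regression on a single batch via iteratively reweighted least squares.

    Returns the batch estimate and a curvature summary H = X' D X, where D flags
    the observations whose residual lies in the quadratic region of the loss.
    """
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # ordinary least squares start
    for _ in range(n_iter):
        w = huber_weights(y - X @ beta, delta)
        new = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
        converged = np.max(np.abs(new - beta)) < tol
        beta = new
        if converged:
            break
    inside = (np.abs(y - X @ beta) <= delta).astype(float)
    H = X.T @ (inside[:, None] * X)
    return beta, H

class OnlineHuber:
    """Running estimator that stores only a p x p matrix and a p-vector."""

    def __init__(self, p):
        self.H = np.zeros((p, p))   # accumulated curvature summaries
        self.Hb = np.zeros(p)       # accumulated curvature-weighted estimates

    def update(self, X, y, delta=1.345):
        """Absorb one batch; the raw batch can be discarded afterwards."""
        beta_k, H_k = fit_huber_batch(X, y, delta=delta)
        self.H += H_k
        self.Hb += H_k @ beta_k
        return self.coef()

    def coef(self):
        """Current estimate: curvature-weighted average of the batch estimates."""
        return np.linalg.solve(self.H, self.Hb)
```

In this sketch, memory use is O(p^2) regardless of how many batches have arrived, which is the property the abstract refers to as not storing historical data. Whether this particular combination rule matches the full-data estimator asymptotically is not something the sketch establishes; that claim belongs to the paper's own estimator.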
Pages: 1197-1223
Page count: 27
Related papers
50 in total
  • [41] Evidence Updating for Stream-Processing in Big-Data: Robust Conditioning in Soft and Hard Data Fusion Environments
    Wickramarathne, Thanuka
    2017 20TH INTERNATIONAL CONFERENCE ON INFORMATION FUSION (FUSION), 2017, : 327 - 333
  • [42] Leveraging for big data regression
    Ma, Ping
    Sun, Xiaoxiao
    WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL STATISTICS, 2015, 7 (01): : 70 - 76
  • [43] Anomaly Detection Guidelines for Data Streams in Big Data
    Rana, Annie Ibrahim
    Estrada, Giovani
    Sole, Marc
    Muntes, Victor
    2016 3RD INTERNATIONAL CONFERENCE ON SOFT COMPUTING & MACHINE INTELLIGENCE (ISCMI 2016), 2016, : 94 - 98
  • [44] A robust algorithm of support vector regression with a trimmed Huber loss function in the primal
    Chen, Chuanfa
    Yan, Changqing
    Zhao, Na
    Guo, Bin
    Liu, Guolin
    SOFT COMPUTING, 2017, 21 (18) : 5235 - 5243
  • [45] A robust algorithm of support vector regression with a trimmed Huber loss function in the primal
    Chuanfa Chen
    Changqing Yan
    Na Zhao
    Bin Guo
    Guolin Liu
    Soft Computing, 2017, 21 : 5235 - 5243
  • [46] Duality in robust linear regression using Huber's M-estimator
    Pinar, MC
    APPLIED MATHEMATICS LETTERS, 1997, 10 (04) : 65 - 70
  • [47] The Regression Learning of the Imbalanced and Big Data by the Online Mixture Model for the Mach Number Forecasting
    Wang, Xiao-Jun
    Liu, Yan
    Yuan, Ping
    Zhou, Chang-Jun
    Zhang, Lin
    IEEE ACCESS, 2019, 7 : 7368 - 7380
  • [48] Online and Distribution-Free Robustness: Regression and Contextual Bandits with Huber Contamination
    Chen, Sitan
    Koehler, Frederic
    Moitra, Ankur
    Yau, Morris
    2021 IEEE 62ND ANNUAL SYMPOSIUM ON FOUNDATIONS OF COMPUTER SCIENCE (FOCS 2021), 2022, : 684 - 695
  • [49] Sharp non-asymptotic performance bounds for ℓ1 and Huber robust regression estimators
    Flores, Salvador
    TEST, 2015, 24 (04) : 796 - 812
  • [50] Robust Fused Lasso Penalized Huber Regression with Nonasymptotic Property and Implementation Studies
    Xin, Xin
    Xie, Boyi
    Xiao, Yunhai
    arXiv, 2022,