Online updating Huber robust regression for big data streams

Cited by: 1
Authors
Tao, Chunbai [1, 2]
Wang, Shanshan [1, 3]
Affiliations
[1] Beihang Univ, Sch Econ & Management, Beijing, Peoples R China
[2] Fudan Univ, Sch Data Sci, Shanghai, Peoples R China
[3] Beihang Univ, MOE, Key Lab Complex Syst Anal & Management Decis, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Online updating; Huber regression; big data streams; divide-and-conquer; quantile regression
DOI
10.1080/02331888.2024.2398057
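Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714
Abstract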
Big data streams have garnered significant attention in multiple industries. However, the immense volume and the presence of outliers in high-velocity streaming data pose great challenges to its analysis. To address these concerns, this paper introduces a novel Online Updating Huber Robust Regression algorithm. By efficiently capturing the salient features of new data subsets, a computationally efficient online updating estimator is proposed without the need for storing historical data. Furthermore, by incorporating Huber regression into its framework, the estimator exhibits robustness to heavy-tailed, heterogeneous as well as outlier-contaminated data. Theoretically, the proposed online updating estimator is asymptotically equivalent to an Oracle estimator derived from the entire dataset. Extensive numerical simulations and a real-world data analysis have been conducted to demonstrate the effectiveness and practicality of the proposed method.
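The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea (robust per-batch fitting plus summary-statistic aggregation), not the authors' estimator. It assumes a fixed Huber threshold `tau`, fits each incoming batch by iteratively reweighted least squares, and combines batches through running sums of a Hessian-like matrix, a generic divide-and-conquer style aggregation. The names `fit_huber_batch` and `OnlineHuber`, the value `tau = 1.345`, and the simulated data are all illustrative assumptions.

```python
import numpy as np


def fit_huber_batch(X, y, tau, n_iter=50, tol=1e-8):
    """Huber regression on one batch via iteratively reweighted least squares."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # ordinary least-squares start
    for _ in range(n_iter):
        r = y - X @ beta
        abs_r = np.maximum(np.abs(r), 1e-12)
        # IRLS weights psi(r)/r: 1 for small residuals, tau/|r| for large ones
        w = np.where(abs_r <= tau, 1.0, tau / abs_r)
        XtW = X.T * w
        beta_new = np.linalg.solve(XtW @ X, XtW @ y)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta


class OnlineHuber:
    """Hypothetical streaming aggregator: stores only p x p and p-vector sums."""

    def __init__(self, p, tau=1.345):
        self.tau = tau
        self.H = np.zeros((p, p))   # running sum of batch Hessian-like matrices
        self.Hb = np.zeros(p)       # running sum of H_b @ beta_b
        self.beta = np.zeros(p)

    def update(self, X, y):
        beta_b = fit_huber_batch(X, y, self.tau)
        r = y - X @ beta_b
        inlier = (np.abs(r) <= self.tau).astype(float)
        H_b = (X.T * inlier) @ X    # second derivative of the Huber loss, summed
        self.H += H_b
        self.Hb += H_b @ beta_b
        # Weighted combination of all batch estimates seen so far
        self.beta = np.linalg.solve(self.H, self.Hb)
        return self.beta


# Simulated stream with heavy-tailed (Student t, 2 d.f.) noise; assumed setup.
rng = np.random.default_rng(0)
beta_true = np.array([1.0, -2.0, 0.5])
model = OnlineHuber(p=3)
for _ in range(20):
    X = np.column_stack([np.ones(500), rng.normal(size=(500, 2))])
    y = X @ beta_true + rng.standard_t(df=2, size=500)
    est = model.update(X, y)  # the raw batch can be discarded after this call
```

After each call to `update`, only `H`, `Hb`, and `beta` are retained, which illustrates how an estimator can be refreshed without storing historical data.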
Pages: 1197-1223
Number of pages: 27