Online inference in high-dimensional generalized linear models with streaming data

Authors
Luo, Lan [1]
Han, Ruijian [2]
Lin, Yuanyuan [3]
Huang, Jian [2]
Affiliations
[1] Rutgers Sch Publ Hlth, Dept Biostat & Epidemiol, Piscataway, NJ USA
[2] Hong Kong Polytech Univ, Dept Appl Math, Hong Kong, Peoples R China
[3] Chinese Univ Hong Kong, Dept Stat, Hong Kong, Peoples R China
Source
ELECTRONIC JOURNAL OF STATISTICS | 2023 / Vol. 17 / No. 02
Funding
National Natural Science Foundation of China;
Keywords
Confidence interval; generalized linear models; online debiased lasso; high-dimensional data; CONFIDENCE-INTERVALS; THRESHOLDING ALGORITHM; VARIABLE SELECTION; SHRINKAGE;
DOI
10.1214/23-EJS2182
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification code
020208 ; 070103 ; 0714 ;
Abstract
In this paper, we develop an online statistical inference approach for high-dimensional generalized linear models with streaming data, enabling real-time estimation and inference. We propose an online debiased lasso method that aligns with the data collection scheme of streaming data. Online debiased lasso differs from offline debiased lasso in two important aspects. First, it updates component-wise confidence intervals of regression coefficients using only summary statistics of the historical data. Second, it adds a term that corrects the approximation errors accumulated throughout the online updating procedure. We show that the proposed online debiased estimators in generalized linear models are asymptotically normal. This result provides a theoretical basis for carrying out real-time interim statistical inference with streaming data. Extensive numerical experiments evaluate the performance of the proposed online debiased lasso method; they demonstrate the effectiveness of the algorithm and support the theoretical results. Furthermore, we illustrate the application of our method with a high-dimensional text dataset.
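The idea of updating component-wise confidence intervals from summary statistics alone can be illustrated with a much simpler stand-in. The sketch below is not the authors' online debiased lasso: it assumes a low-dimensional linear model fitted by ordinary least squares, with no lasso penalty, no debiasing step, and no error-correction term. The class name `StreamingOLS` and the simulated data are hypothetical, chosen only for illustration. It shows that once each batch has been folded into a few running sums, the raw historical data is no longer needed to refresh the intervals.

```python
import numpy as np


class StreamingOLS:
    """Hypothetical helper: stores only summary statistics of past batches.

    Assumption: low-dimensional linear model with Gaussian-style intervals.
    This is a stand-in for illustration, not the online debiased lasso.
    """

    def __init__(self, p):
        self.n = 0                    # number of observations seen so far
        self.xtx = np.zeros((p, p))   # running sum of X^T X
        self.xty = np.zeros(p)        # running sum of X^T y
        self.yty = 0.0                # running sum of y^T y

    def update(self, X, y):
        # Fold one batch into the summaries; the batch can then be discarded.
        self.n += X.shape[0]
        self.xtx += X.T @ X
        self.xty += X.T @ y
        self.yty += float(y @ y)

    def estimate(self):
        # Least-squares solution computed from the summaries only.
        return np.linalg.solve(self.xtx, self.xty)

    def confidence_intervals(self, z=1.96):
        # Component-wise normal-approximation intervals (z=1.96 gives ~95%).
        beta = self.estimate()
        p = beta.shape[0]
        rss = self.yty - 2.0 * beta @ self.xty + beta @ self.xtx @ beta
        sigma2 = rss / (self.n - p)
        se = np.sqrt(sigma2 * np.diag(np.linalg.inv(self.xtx)))
        return beta - z * se, beta + z * se


# Simulated stream: five batches of 200 observations each (illustrative data).
rng = np.random.default_rng(0)
beta_true = np.array([1.0, -2.0, 0.5])
model = StreamingOLS(p=3)
batches = []
for _ in range(5):
    X = rng.normal(size=(200, 3))
    y = X @ beta_true + rng.normal(size=200)
    model.update(X, y)
    batches.append((X, y))  # kept here only to compare with an offline fit

lower, upper = model.confidence_intervals()
```

In this simplified setting the streaming estimate coincides with the offline least-squares fit on the pooled data, so nothing is lost by discarding the batches. Per the abstract, the high-dimensional generalized linear model case is harder: approximation errors accumulate across updates, which is why the proposed method adds a correction term.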
Pages: 3443-3471
Number of pages: 29