ROBUST, SPARSE AND SCALABLE INFERENCE USING BOOTSTRAP AND VARIABLE SELECTION FUSION

Citations: 0
Authors
Mozafari-Majd, Emadaldin [1 ]
Koivunen, Visa [1 ]
Affiliations
[1] Aalto Univ, Dept Signal Proc & Acoust, POB 15400, FI-00076 Aalto, Finland
Funding
Academy of Finland
Keywords
statistical inference; robust; bootstrap; sparsity; high-dimensional; large-scale; regression; ridge
DOI
10.1109/camsap45676.2019.9022472
Chinese Library Classification (CLC) number
TP3 [computing technology; computer technology]
Subject classification code
0812
Abstract
In this paper, we address the challenging problem of conducting statistical inference for large-scale data sets in the presence of sparsity and outlying observations. In particular, processing and storing such data on a single computing node may be infeasible due to their high volume and dimensionality. Therefore, the large-scale data set is subdivided into smaller distinct subsets that may be stored and processed on different nodes. We propose a robust and scalable statistical inference method using a two-stage algorithm in which variable selection is performed by fusing the supports selected from each distinct subset of the data. The actual parameter and confidence-interval estimation takes place in the second stage using a robust extension of the Bag of Little Bootstraps (BLB) technique. To exploit sparsity and ensure robustness, the MM-Lasso estimator is used to select variables for each subset of the data. The selections are then fused to find the support for the original large-scale data set. In the second stage, the robust MM-estimator is applied on the selected support. Simulation studies demonstrate that the algorithm performs highly reliably in variable selection and provides reliable confidence intervals even when the estimation problem in the subsets is slightly under-determined.
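The two-stage procedure described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: a simple correlation-screening rule stands in for the MM-Lasso selector, ordinary least squares stands in for the robust MM-estimator, majority voting is assumed as the support-fusion rule, and the per-subset BLB confidence intervals are simply averaged.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated sparse linear model: only the first 3 of 20 predictors are active.
n, p = 600, 20
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.5, 1.5]
X = rng.standard_normal((n, p))
y = X @ beta_true + 0.5 * rng.standard_normal(n)

# Stage 1: per-subset variable selection, then fusion of the selected supports.
# (The paper uses MM-Lasso; a correlation-screening rule stands in here.)
def select_support(Xs, ys, thresh=0.25):
    corr = np.abs(Xs.T @ ys) / (np.linalg.norm(Xs, axis=0) * np.linalg.norm(ys))
    return corr > thresh

subsets = np.array_split(rng.permutation(n), 4)  # 4 disjoint subsets of rows
votes = np.zeros(p)
for idx in subsets:
    votes += select_support(X[idx], y[idx])
support = votes >= len(subsets) / 2  # majority-vote fusion (assumed rule)

# Stage 2: Bag of Little Bootstraps on the fused support. Each subset is
# resampled with multinomial weights summing to the full sample size n, so
# every weighted fit mimics an estimate on n points while touching only the
# subset's rows. (A robust MM-estimator would replace lstsq in the paper.)
def blb_ci(Xs, ys, n_total, B=200, alpha=0.05):
    est = []
    for _ in range(B):
        w = rng.multinomial(n_total, np.ones(len(ys)) / len(ys))
        keep = w > 0
        Xw = Xs[keep] * np.sqrt(w[keep])[:, None]  # weighted least squares
        yw = ys[keep] * np.sqrt(w[keep])
        est.append(np.linalg.lstsq(Xw, yw, rcond=None)[0])
    est = np.array(est)
    return np.percentile(est, [100 * alpha / 2, 100 * (1 - alpha / 2)], axis=0)

# Average the per-subset percentile intervals (one simple aggregation choice).
cis = np.mean([blb_ci(X[idx][:, support], y[idx], n) for idx in subsets], axis=0)
print("fused support:", np.flatnonzero(support))
print("95% CIs for active coefficients:\n", cis.T)
```

With the strong signal-to-noise ratio simulated here, the majority-vote fusion recovers the three active predictors, and the averaged BLB intervals are computed only over that small fused support, which is what makes the second stage cheap even when p is large.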
Pages: 271-275
Page count: 5