NONPENALIZED VARIABLE SELECTION IN HIGH-DIMENSIONAL LINEAR MODEL SETTINGS VIA GENERALIZED FIDUCIAL INFERENCE

Cited by: 10
Authors
Williams, Jonathan P. [1 ]
Hannig, Jan [1 ]
Affiliations
[1] Univ N Carolina, Dept Stat & Operat Res, Chapel Hill, NC 27599 USA
Source
ANNALS OF STATISTICS | 2019 / Vol. 47 / No. 03
Funding
National Science Foundation (USA);
Keywords
Best subset selection; high-dimensional regression; L-0 minimization; feature selection; REGRESSION;
DOI
10.1214/18-AOS1733
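Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification code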
020208 ; 070103 ; 0714 ;
Abstract
Standard penalized methods of variable selection and parameter estimation rely on the magnitude of coefficient estimates to decide which variables to include in the final model. However, coefficient estimates are unreliable when the design matrix is collinear. To overcome this challenge, an entirely new perspective on variable selection is presented within a generalized fiducial inference framework. This new procedure is able to effectively account for linear dependencies among subsets of covariates in a high-dimensional setting where p can grow almost exponentially in n, as well as in the classical setting where p <= n. It is shown that the procedure very naturally assigns small probabilities to subsets of covariates which include redundancies by way of explicit L-0 minimization. Furthermore, with a typical sparsity assumption, it is shown that the proposed method is consistent in the sense that the probability of the true sparse subset of covariates converges in probability to 1 as n -> infinity, or as n -> infinity and p -> infinity. Very reasonable conditions are needed, and little restriction is placed on the class of possible subsets of covariates to achieve this consistency result.
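As a rough illustration of the L-0 minimization idea mentioned in the abstract, the toy sketch below searches column subsets by brute force and returns the smallest one whose least-squares fit reproduces the response. It is not the generalized fiducial procedure of the paper: the function name `smallest_exact_subset`, the tolerance `tol`, the noise-free response, and the simulated design (in which column 2 is the exact sum of columns 0 and 1) are all assumptions made for illustration, and exhaustive search is feasible only for very small p.

```python
import itertools
import numpy as np


def smallest_exact_subset(X, y, tol=1e-8):
    """Return the smallest column subset whose least-squares fit reproduces y.

    Hypothetical helper: brute-force L-0 minimization, feasible only for small p.
    Returns None if no subset fits y to within `tol` (residual sum of squares).
    """
    n, p = X.shape
    for k in range(p + 1):  # try subset sizes 0, 1, ..., p in increasing order
        for subset in itertools.combinations(range(p), k):
            if k == 0:
                resid = y  # the empty model predicts zero
            else:
                Xs = X[:, list(subset)]
                coef, *_ = np.linalg.lstsq(Xs, y, rcond=None)
                resid = y - Xs @ coef
            if float(resid @ resid) <= tol:
                return subset  # the first hit has minimal size
    return None


# Assumed toy design: column 2 is an exact linear combination of columns 0 and 1.
rng = np.random.default_rng(0)
n = 50
x0, x1, x3 = (rng.normal(size=n) for _ in range(3))
x2 = x0 + x1
X = np.column_stack([x0, x1, x2, x3])
y = 3.0 * x2  # noise-free response that depends on column 2 only

print(smallest_exact_subset(X, y))
```

Both {x0, x1} and {x2} reproduce y exactly in this design, but the search returns the single-column subset because it visits smaller subsets first. This mirrors, in a very simplified way, the preference for subsets without redundant covariates.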
Pages: 1723 - 1753
Number of pages: 31
Related papers
50 in total
  • [21] Variable selection in high-dimensional partly linear additive models
    Lian, Heng
    JOURNAL OF NONPARAMETRIC STATISTICS, 2012, 24 (04) : 825 - 839
  • [22] Estimating the effect of a variable in a high-dimensional linear model
    Jensen, Peter S.
    Wurtz, Allan H.
    ECONOMETRICS JOURNAL, 2012, 15 (02) : 325 - 357
  • [23] Homogeneity detection for the high-dimensional generalized linear model
    Jeon, Jong-June
    Kwon, Sunghoon
    Choi, Hosik
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2017, 114 : 61 - 74
  • [24] Variable selection for high-dimensional varying coefficient partially linear models via nonconcave penalty
    Hong, Zhaoping
    Hu, Yuao
    Lian, Heng
    METRIKA, 2013, 76 (07) : 887 - 908
  • [25] VARIABLE SELECTION FOR HIGH-DIMENSIONAL GENERALIZED VARYING-COEFFICIENT MODELS
    Lian, Heng
    STATISTICA SINICA, 2012, 22 (04) : 1563 - 1588
  • [26] Variable selection for high-dimensional varying coefficient partially linear models via nonconcave penalty
    Zhaoping Hong
    Yuao Hu
    Heng Lian
    Metrika, 2013, 76 : 887 - 908
  • [27] Bayesian adaptive lasso with variational Bayes for variable selection in high-dimensional generalized linear mixed models
    Dao Thanh Tung
    Minh-Ngoc Tran
    Tran Manh Cuong
    COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2019, 48 (02) : 530 - 543
  • [28] A Model Selection Criterion for High-Dimensional Linear Regression
    Owrang, Arash
    Jansson, Magnus
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2018, 66 (13) : 3436 - 3446
  • [29] The EAS approach to variable selection for multivariate response data in high-dimensional settings
    Koner, Salil
    Williams, Jonathan P.
    ELECTRONIC JOURNAL OF STATISTICS, 2023, 17 (02) : 1947 - 1995
  • [30] Robust Inference for High-Dimensional Linear Models via Residual Randomization
    Wang, Y. Samuel
    Lee, Si Kai
    Toulis, Panos
    Kolar, Mladen
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139 : 7818 - 7828