Resampling-based information criteria for best-subset regression

Cited by: 0
Authors
Philip T. Reiss
Lei Huang
Joseph E. Cavanaugh
Amy Krain Roy
Affiliations
[1] New York University School of Medicine, Department of Child and Adolescent Psychiatry
[2] Nathan S. Kline Institute for Psychiatric Research, Department of Biostatistics
[3] Johns Hopkins Bloomberg School of Public Health, Department of Biostatistics
[4] University of Iowa College of Public Health, Department of Psychology
[5] Fordham University
Keywords
Adaptive model selection; Covariance inflation criterion; Cross-validation; Extended information criterion; Functional connectivity; Overoptimism
DOI
Not available
Abstract
When a linear model is chosen by searching for the best subset among a set of candidate predictors, a fixed penalty such as that imposed by the Akaike information criterion may penalize model complexity inadequately, leading to biased model selection. We study resampling-based information criteria that aim to overcome this problem through improved estimation of the effective model dimension. The first proposed approach builds upon previous work on bootstrap-based model selection. We then propose a more novel approach based on cross-validation. Simulations and analyses of a functional neuroimaging data set illustrate the strong performance of our resampling-based methods, which are implemented in a new R package.
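The selection problem described in the abstract can be illustrated with a small sketch. This is not the authors' method or their R package; it is a plain exhaustive best-subset search scored with the standard Gaussian AIC, run on pure-noise data, assuming a small number of candidate predictors. The function names `rss`, `aic`, and `best_subset_by_size` and the simulated data are hypothetical choices made for this example.

```python
import itertools
import math
import random


def rss(X, y, cols):
    """Residual sum of squares for least squares of y on an intercept plus the columns in cols."""
    n = len(y)
    Z = [[1.0] + [X[i][j] for j in cols] for i in range(n)]
    k = len(cols) + 1
    # Build the normal equations A b = c.
    A = [[sum(Z[i][r] * Z[i][s] for i in range(n)) for s in range(k)] for r in range(k)]
    c = [sum(Z[i][r] * y[i] for i in range(n)) for r in range(k)]
    # Solve by Gaussian elimination with partial pivoting.
    for p in range(k):
        piv = max(range(p, k), key=lambda r: abs(A[r][p]))
        A[p], A[piv] = A[piv], A[p]
        c[p], c[piv] = c[piv], c[p]
        for r in range(p + 1, k):
            f = A[r][p] / A[p][p]
            for s in range(p, k):
                A[r][s] -= f * A[p][s]
            c[r] -= f * c[p]
    b = [0.0] * k
    for p in reversed(range(k)):
        b[p] = (c[p] - sum(A[p][s] * b[s] for s in range(p + 1, k))) / A[p][p]
    return sum((y[i] - sum(Z[i][r] * b[r] for r in range(k))) ** 2 for i in range(n))


def aic(rss_value, n, n_predictors):
    """Gaussian AIC up to an additive constant; the +2 counts the intercept and error variance."""
    return n * math.log(rss_value / n) + 2 * (n_predictors + 2)


def best_subset_by_size(X, y):
    """For each subset size, return (RSS, columns) of the subset with the smallest RSS."""
    p = len(X[0])
    best = {}
    for size in range(p + 1):
        best[size] = min(
            ((rss(X, y, cols), cols) for cols in itertools.combinations(range(p), size)),
            key=lambda t: t[0],
        )
    return best


# Simulated data (an assumption of this sketch): the response is unrelated to every predictor.
rng = random.Random(0)
n, p = 40, 6
X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [rng.gauss(0, 1) for _ in range(n)]

best = best_subset_by_size(X, y)
for size, (r, cols) in best.items():
    print(size, cols, round(aic(r, n, size), 2))
```

Because each size's winner is the minimum RSS over many candidate subsets, its fit is optimistically small even when no predictor is related to the response, while the AIC penalty of 2 per parameter takes no account of that search. This is the sense in which a fixed penalty may be inadequate; the resampling-based criteria studied in the paper instead estimate the effective model dimension.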
Pages: 1161 - 1186
Page count: 25
Related papers
  • [1] Resampling-based information criteria for best-subset regression
    Reiss, Philip T.
    Huang, Lei
    Cavanaugh, Joseph E.
    Roy, Amy Krain
    ANNALS OF THE INSTITUTE OF STATISTICAL MATHEMATICS, 2012, 64 (06) : 1161 - 1186
  • [2] BEST-SUBSET SELECTION PROCEDURE
    Wang, Yu
    Luangkesorn, Louis
    Shuman, Larry J.
    PROCEEDINGS OF THE 2011 WINTER SIMULATION CONFERENCE (WSC), 2011 : 4310 - 4318
  • [3] RESAMPLING-BASED ESTIMATOR IN NONLINEAR-REGRESSION
    MONG, J
    WANG, XR
    STATISTICA SINICA, 1994, 4 (01) : 187 - 198
  • [4] Branch-and-bound algorithms for computing the best-subset regression models
    Gatu, C
    Kontoghiorghes, EJ
    JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 2006, 15 (01) : 139 - 156
  • [5] Response best-subset selector for multivariate regression with high-dimensional response variables
    Hu, Jianhua
    Huang, Jian
    Liu, Xiaoqian
    Liu, Xu
    BIOMETRIKA, 2023, 110 (01) : 205 - 223
  • [6] A polynomial algorithm for best-subset selection problem
    Zhu, Junxian
    Wen, Canhong
    Zhu, Jin
    Zhang, Heping
    Wang, Xueqin
    PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2020, 117 (52) : 33117 - 33123
  • [7] Subsampling versus Bootstrapping in Resampling-Based Model Selection for Multivariable Regression
    De Bin, Riccardo
    Janitza, Silke
    Sauerbrei, Willi
    Boulesteix, Anne-Laure
    BIOMETRICS, 2016, 72 (01) : 272 - 280
  • [8] Resampling-based calculation of the information matrix for general identification problems
    Spall, JC
    PROCEEDINGS OF THE 1998 AMERICAN CONTROL CONFERENCE, VOLS 1-6, 1998 : 3194 - 3198
  • [9] Resampling-based inferences for compositional regression with application to beef cattle microbiomes
    Lee, Sujin
    Jung, Sungkyu
    Lourenco, Jeferson
    Pringle, Dean
    Ahn, Jeongyoun
    STATISTICAL METHODS IN MEDICAL RESEARCH, 2023, 32 (01) : 151 - 164
  • [10] A Heuristic Approach for Selecting Best-Subset Including Ranking Within the Subset
    Choi, Seon Han
    Kim, Tag Gon
    IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2020, 50 (10) : 3852 - 3862