Inference in Linear Regression Models with Many Covariates and Heteroscedasticity

Cited by: 52
Authors
Cattaneo, Matias D. [1, 2]
Jansson, Michael [3, 4]
Newey, Whitney K. [5]
Affiliations
[1] Univ Michigan, Dept Econ, 611 Tappan St,238 Lorch Hall, Ann Arbor, MI 48109 USA
[2] Univ Michigan, Dept Stat, 611 Tappan St,238 Lorch Hall, Ann Arbor, MI 48109 USA
[3] Univ Calif Berkeley, Dept Econ, Berkeley, CA 94720 USA
[4] Aarhus Univ, CREATES, Aarhus, Denmark
[5] MIT, Dept Econ, Cambridge, MA 02139 USA
Funding
National Research Foundation, Singapore; U.S. National Science Foundation;
Keywords
Heteroscedasticity; High-dimensional models; Linear regression; Many regressors; Standard errors; ASYMPTOTIC NORMALITY; MISSPECIFIED MODELS; CONVERGENCE-RATES; ROBUST REGRESSION; WILD BOOTSTRAP; SUBCLASSIFICATION; JACKKNIFE;
DOI
10.1080/01621459.2017.1328360
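Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification code
020208 ; 070103 ; 0714 ;
Abstract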
The linear regression model is widely used in empirical work in economics, statistics, and many other disciplines. Researchers often include many covariates in their linear model specification in an attempt to control for confounders. We give inference methods that allow for many covariates and heteroscedasticity. Our results are obtained using high-dimensional approximations, where the number of included covariates is allowed to grow as fast as the sample size. We find that all of the usual versions of Eicker-White heteroscedasticity consistent standard error estimators for linear models are inconsistent under this asymptotics. We then propose a new heteroscedasticity consistent standard error formula that is fully automatic and robust to both (conditional) heteroscedasticity of unknown form and the inclusion of possibly many covariates. We apply our findings to three settings: parametric linear models with many covariates, linear panel models with many fixed effects, and semiparametric semi-linear models with many technical regressors. Simulation evidence consistent with our theoretical results is provided, and the proposed methods are also illustrated with an empirical application. Supplementary materials for this article are available online.
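For orientation, the classic Eicker-White (HC0) standard errors that the abstract refers to can be sketched as follows. This is a minimal illustration of the textbook sandwich formula, not the new estimator proposed in the article; the function name `hc0_standard_errors` and the simulated data are assumptions made for this example only.

```python
import numpy as np

def hc0_standard_errors(X, y):
    """OLS coefficients and classic Eicker-White (HC0) standard errors.

    Hypothetical helper for illustration; it implements the textbook
    sandwich formula (X'X)^{-1} X' diag(u_hat^2) X (X'X)^{-1}.
    """
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    # "Meat" of the sandwich: sum_i u_hat_i^2 * x_i x_i'
    meat = X.T @ (X * resid[:, None] ** 2)
    cov = XtX_inv @ meat @ XtX_inv
    return beta, np.sqrt(np.diag(cov))

# Simulated data (assumed for this sketch): few covariates, so HC0 is
# in its familiar low-dimensional setting.
rng = np.random.default_rng(0)
n, k = 500, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
# Error scale depends on the first non-constant regressor (heteroscedasticity).
u = rng.normal(size=n) * (1.0 + np.abs(X[:, 1]))
y = X @ np.array([1.0, 2.0, -1.0]) + u

beta_hat, se_hc0 = hc0_standard_errors(X, y)
```

Per the abstract, estimators of this usual type become inconsistent when the number of covariates grows as fast as the sample size, which is the setting the article's proposed formula is meant to handle.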
Pages: 1350-1361
Number of pages: 12
Related papers
50 in total
  • [21] Heteroscedasticity checks for regression models
    Lixing Zhu
    Yasunori Fujikoshi
    Kanta Naito
    [J]. Science in China Series A: Mathematics, 2001, 44 : 1236 - 1252
  • [22] Sparse linear regression models of high dimensional covariates with non-Gaussian outliers and Berkson error-in-variable under heteroscedasticity
    Wu, Yuh-Jenn
    Cheng, Li-Hsueh
    Fang, Wei-Quan
    [J]. COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2021, 50 (11) : 3146 - 3165
  • [23] Testing for heteroscedasticity in regression models
    Carapeto, M
    Holt, W
    [J]. JOURNAL OF APPLIED STATISTICS, 2003, 30 (01) : 13 - 20
  • [24] Heteroscedasticity checks for regression models
    朱力行
    Yasunori FUJIKOSHI
    Kanta NAITO
    [J]. Science China Mathematics, 2001, (10) : 1236 - 1252
  • [25] Statistical inference in nonlinear regression under heteroscedasticity
    Lim, Changwon
    Sen, Pranab K.
    Peddada, Shyamal D.
    [J]. SANKHYA-SERIES B-APPLIED AND INTERDISCIPLINARY STATISTICS, 2010, 72 (02) : 202 - 218
  • [26] Statistical inference in nonlinear regression under heteroscedasticity
    Changwon Lim
    Pranab K. Sen
    Shyamal D. Peddada
    [J]. Sankhya B, 2010, 72 (2) : 202 - 218
  • [27] Heteroscedasticity checks for regression models
    Zhu, LX
    Fujikoshi, Y
    Naito, K
    [J]. SCIENCE IN CHINA SERIES A-MATHEMATICS PHYSICS ASTRONOMY, 2001, 44 (10) : 1236 - 1252
  • [28] Identification and estimation of partially linear censored regression models with unknown heteroscedasticity
    Zhang, Zhengyu
    Liu, Bing
    [J]. ECONOMETRICS JOURNAL, 2015, 18 (02) : 242 - 273
  • [29] Imputation and variable selection in linear regression models with missing covariates
    Yang, XW
    Belin, TR
    Boscardin, WJ
    [J]. BIOMETRICS, 2005, 61 (02) : 498 - 506
  • [30] LIKELIHOOD INFERENCE FOR LINEAR-REGRESSION MODELS
    DICICCIO, TJ
    [J]. BIOMETRIKA, 1988, 75 (01) : 29 - 34