THE PHASE TRANSITION FOR THE EXISTENCE OF THE MAXIMUM LIKELIHOOD ESTIMATE IN HIGH-DIMENSIONAL LOGISTIC REGRESSION

Cited by: 51
Authors
Candes, Emmanuel J. [1, 2]
Sur, Pragya [2]
Affiliations
[1] Stanford Univ, Dept Math, Stanford, CA 94305 USA
[2] Harvard Univ, Harvard John A Paulson Sch Engn & Appl Sci, Cambridge, MA 02138 USA
Source
ANNALS OF STATISTICS | 2020 / Vol. 48 / No. 1
Keywords
High-dimensional logistic regression; MLE phase transition;
DOI
10.1214/18-AOS1789
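Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification code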
020208 ; 070103 ; 0714 ;
Abstract
This paper rigorously establishes that the existence of the maximum likelihood estimate (MLE) in high-dimensional logistic regression models with Gaussian covariates undergoes a sharp "phase transition." We introduce an explicit boundary curve h_MLE, parameterized by two scalars measuring the overall magnitude of the unknown sequence of regression coefficients, with the following property: in the limit of large sample sizes n and number of features p proportioned in such a way that p/n → κ, we show that if the problem is sufficiently high dimensional in the sense that κ > h_MLE, then the MLE does not exist with probability one. Conversely, if κ < h_MLE, the MLE asymptotically exists with probability one.
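The abstract does not give the formula for h_MLE, so the sketch below does not attempt to compute it. It illustrates the kind of sharp transition described in one special case, which is an assumption of this example rather than something stated in the record: responses that are independent of the covariates (fair coin-flip labels, no signal) and a model without an intercept. It also relies on two standard facts: the logistic MLE fails to exist exactly when the two classes can be separated by a hyperplane, and for n points in general position in p dimensions Cover's classical counting argument gives the probability of such a separation in closed form. In this null case the transition sits at p/n = 1/2. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def prob_separable_null(n: int, p: int) -> Fraction:
    """Exact probability that n points in general position in R^p, with
    independent fair +/-1 labels, are separable by a hyperplane through
    the origin (Cover's counting formula):

        2^{-(n-1)} * sum_{k=0}^{p-1} C(n-1, k)

    Assumptions of this sketch: no signal, no intercept.
    """
    if n < 1 or p < 1:
        raise ValueError("n and p must be positive integers")
    # math.comb returns 0 when k > n - 1, so p >= n gives probability 1.
    count = sum(comb(n - 1, k) for k in range(p))
    return Fraction(count, 2 ** (n - 1))


def prob_mle_exists_null(n: int, p: int) -> Fraction:
    """Probability that the logistic MLE exists in the null case above,
    using the fact that the MLE exists iff the data are not separable."""
    return 1 - prob_separable_null(n, p)


if __name__ == "__main__":
    n = 1000
    for p in (300, 400, 450, 500, 550, 600, 700):
        prob = float(prob_mle_exists_null(n, p))
        print(f"kappa = {p / n:.2f}  P(MLE exists) = {prob:.6f}")
```

Running the script with n = 1000 shows the existence probability dropping from essentially one to essentially zero as p/n crosses 1/2, and it equals exactly 1/2 when n = 2p. With a nonzero signal the boundary moves, which is the general case the paper addresses through h_MLE.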
Pages: 27-42
Page count: 16