Variable selection in finite mixture of regression models

Cited by: 165
Authors
Khalili, Abbas [1 ]
Chen, Jiahua [2]
Affiliations
[1] Ohio State Univ, Dept Stat, Columbus, OH 43210 USA
[2] Univ British Columbia, Dept Stat, Vancouver, BC V6T 1Z2, Canada
Keywords
EM algorithm; LASSO; mixture model; penalty method; SCAD;
DOI
10.1198/016214507000000590
Chinese Library Classification (CLC)
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline Classification Codes
020208; 070103; 0714
Abstract
In applications of finite mixture of regression (FMR) models, many covariates are often used, and their contributions to the response variable vary from one component of the mixture to another. This creates a complex variable selection problem. Existing methods, such as the Akaike information criterion and the Bayes information criterion, become computationally expensive as the number of covariates and mixture components increases. In this article we introduce a penalized likelihood approach for variable selection in FMR models. The new method introduces penalties that depend on the size of the regression coefficients and on the mixture structure. The new method is shown to be consistent for variable selection. A data-adaptive method for selecting tuning parameters and an EM algorithm for efficient numerical computation are developed. Simulations show that the method performs very well and requires much less computing power than existing methods. The new method is illustrated by analyzing two real data sets.
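The approach sketched in the abstract, maximizing a mixture likelihood with coefficient-size penalties via EM, can be illustrated with a minimal sketch. The following is an assumption-laden toy version, not the paper's exact procedure: a two-component mixture of linear regressions with an L1 (LASSO-type) penalty, where the M-step uses one sweep of coordinate descent with soft-thresholding and, echoing the paper's idea of penalties that depend on the mixture structure, the threshold for each component is scaled by its mixing proportion. Function and variable names here are illustrative.

```python
import numpy as np

def soft_threshold(z, t):
    # Soft-thresholding operator, the proximal map of an L1 penalty.
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def penalized_em_fmr(X, y, lam=0.05, n_iter=200, seed=0):
    """Toy penalized EM for a 2-component mixture of linear regressions.

    Illustrative sketch only: uses a plain LASSO penalty scaled by the
    mixing proportions, not the SCAD/adaptive penalties of the paper.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    beta = rng.normal(scale=0.1, size=(2, p))   # small random init breaks symmetry
    pi = np.array([0.5, 0.5])
    sigma2 = np.array([np.var(y), np.var(y)])
    for _ in range(n_iter):
        # E-step: posterior probability that each observation belongs to each component.
        dens = np.empty((n, 2))
        for k in range(2):
            resid = y - X @ beta[k]
            dens[:, k] = (pi[k] / np.sqrt(2 * np.pi * sigma2[k])
                          * np.exp(-0.5 * resid**2 / sigma2[k]))
        w = dens / dens.sum(axis=1, keepdims=True)
        # M-step: update mixing proportions, then weighted penalized least squares.
        pi = w.mean(axis=0)
        for k in range(2):
            for j in range(p):
                # Partial residual excluding covariate j, then soft-threshold its update.
                r = y - X @ beta[k] + X[:, j] * beta[k, j]
                num = np.sum(w[:, k] * X[:, j] * r)
                den = np.sum(w[:, k] * X[:, j] ** 2)
                beta[k, j] = soft_threshold(num, n * pi[k] * lam) / den
            resid = y - X @ beta[k]
            sigma2[k] = np.sum(w[:, k] * resid**2) / np.sum(w[:, k])
    return pi, beta, sigma2
```

Irrelevant covariates are shrunk exactly to zero by the soft-thresholding step, so variable selection and estimation happen in one pass; the paper's actual penalties (e.g. SCAD) reduce the bias that a plain LASSO penalty induces on large coefficients.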
Pages: 1025-1038
Page count: 14