Feature screening is a popular and efficient statistical technique for processing ultrahigh-dimensional data. When a regression model contains both categorical and continuous predictors, a unified feature screening procedure is needed. We therefore propose a unified mean-variance sure independence screening (UMV-SIS) procedure for this setting. The mean-variance (MV) index, an effective utility for measuring the dependence between two random variables, is widely used in feature screening for discriminant analysis. In this paper, we advocate using the kernel smoothing method to estimate the MV index between two continuous variables, thereby extending it to screen categorical and continuous predictors simultaneously. Besides providing a unified screening approach, UMV-SIS is model-free: it requires no specification of a regression model, which broadens its scope of application. In theory, we show that the UMV-SIS procedure has the sure screening and ranking consistency properties under mild conditions. To address some difficulties of marginal feature screening in linear models and to further enhance the screening performance of the proposed method, we develop an iterative UMV-SIS procedure. Extensive numerical examples support the promising performance of the new method.
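
To make the marginal screening idea concrete, the following is a minimal sketch of MV-based ranking for the simpler case of one continuous predictor and one categorical variable. It assumes the usual sample form of the MV index, the class-weighted average squared gap between the within-class and pooled empirical distribution functions; the abstract does not state this formula, so treat it as an assumption. The kernel-smoothed estimator for two continuous variables and the iterative procedure are not reproduced here. The names `mv_index` and `screen` are hypothetical and chosen for illustration only.

```python
from collections import defaultdict


def mv_index(x, y):
    """Sample MV index between a continuous list x and categorical labels y.

    Assumed form: (1/n) * sum_j sum_r p_r * (F_r(x_j) - F(x_j))**2, where F is
    the pooled empirical CDF, F_r the empirical CDF within class r, and p_r
    the class proportion.
    """
    n = len(x)
    groups = defaultdict(list)
    for xi, yi in zip(x, y):
        groups[yi].append(xi)
    total = 0.0
    for xj in x:
        # Pooled empirical CDF evaluated at xj.
        f_all = sum(xi <= xj for xi in x) / n
        for members in groups.values():
            p_r = len(members) / n
            # Within-class empirical CDF evaluated at xj.
            f_r = sum(xi <= xj for xi in members) / len(members)
            total += p_r * (f_r - f_all) ** 2
    return total / n


def screen(columns, y, d):
    """Return the indices of the d predictors with the largest MV index."""
    scores = [mv_index(col, y) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda k: scores[k], reverse=True)
    return ranked[:d]
```

Here `columns` is a list of predictor columns, each a list of length n, and `d` is the number of predictors to retain. The index needs no regression model, which is the sense in which this kind of screening is model-free. This direct implementation costs O(n^2) per predictor; sorting-based empirical CDFs would be faster for large n.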