This paper addresses feature selection for linear classifiers when only the moments of the class-conditional densities are given. The problem is posed as finding a minimal set of features such that the resulting classifier has a low misclassification error. Using a bound on the misclassification error that involves the means and covariances of the class-conditional densities, and minimizing the L1 norm as an approximate criterion for feature selection, a second-order cone programming (SOCP) formulation is derived. To handle errors in the estimation of the means and covariances, a tractable robust formulation is also discussed. In a slightly different setting, the Fisher discriminant is derived, and feature selection for the Fisher discriminant is also discussed. Experimental results on synthetic data sets and on real-life microarray data show that the proposed formulations are competitive with the state-of-the-art linear programming formulation.
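As a rough illustration of the kind of problem described above, the following sketch shows one form such a formulation can take. It is not taken from the abstract. It assumes that the error bound is the one-sided Chebyshev inequality, that the classifier is $x \mapsto \operatorname{sign}(w^\top x - b)$, and that each class is required to be classified correctly with probability at least $\eta$ for every distribution with the given moments. The symbols $w$, $b$, $\eta$, $\kappa$, $\mu_i$ and $\Sigma_i$ are introduced here for illustration, and the two margin constraints are an assumed normalization that rules out the trivial solution $w = 0$.

```latex
% Sketch under the assumptions stated above, not the paper's exact formulation.
% (mu_1, Sigma_1) and (mu_2, Sigma_2) are the class means and covariances.
\begin{align*}
\min_{w,\,b}\quad & \|w\|_1 \\
\text{s.t.}\quad
  & w^\top \mu_1 - b \;\ge\; \kappa \sqrt{w^\top \Sigma_1 w}, \\
  & b - w^\top \mu_2 \;\ge\; \kappa \sqrt{w^\top \Sigma_2 w}, \\
  & w^\top \mu_1 - b \;\ge\; 1, \qquad b - w^\top \mu_2 \;\ge\; 1,
\end{align*}
% The one-sided Chebyshev inequality gives, for w^T mu_1 >= b,
%   sup P(w^T x <= b) = 1 / (1 + d^2),  d^2 = (w^T mu_1 - b)^2 / (w^T Sigma_1 w),
% so requiring this worst-case error to be at most 1 - eta yields
\[
  \kappa = \sqrt{\frac{\eta}{1 - \eta}} .
\]
```

Each of the first two constraints bounds a Euclidean norm by an affine function, since $\sqrt{w^\top \Sigma_i w} = \|\Sigma_i^{1/2} w\|_2$, which is what makes the problem a second-order cone program. The L1 objective drives many components of $w$ to zero, and the features with nonzero weight are the selected ones.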