OFFSS: Optimal fuzzy-valued feature subset selection

Cited by: 36
Authors
Tsang, ECC [1]
Yeung, DS
Wang, XZ
Affiliations
[1] Hong Kong Polytech Univ, Dept Comp, Kowloon, Hong Kong, Peoples R China
[2] Hebei Univ, Coll Math & Comp Sci, Machine Learning Ctr, Baoding, Hebei, Peoples R China
Keywords
computational complexity; data mining; feature subset selection; fuzzy-valued feature; learning
DOI
10.1109/TFUZZ.2003.809895
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Feature subset selection is a well-known pattern recognition problem that aims to reduce the number of features used in classification or recognition. This reduction is expected to improve the performance of classification algorithms in terms of speed, accuracy, and simplicity. Most existing feature selection investigations focus on the case where the feature values are real or nominal; very little research addresses fuzzy-valued feature subset selection and its computational complexity. This paper focuses on a problem called optimal fuzzy-valued feature subset selection (OFFSS), in which the quality measure of a subset of features is defined by both the overall overlapping degree between two classes of examples and the size of the feature subset. The main contributions of this paper are: 1) the concept of a fuzzy extension matrix is introduced; 2) OFFSS is proved to be NP-hard; 3) a simple but powerful heuristic algorithm for OFFSS is given; and 4) the feasibility and simplicity of the proposed algorithm are demonstrated by applying OFFSS to fuzzy decision tree induction and by comparing it with three recently developed feature selection techniques.
Pages: 202-213
Page count: 12