Evaluating feature ranking methods in text classifiers

Cited by: 0
Author
Makrehchi, Masoud [1]
Affiliation
[1] UOIT, Dept Elect Comp & Software Engn, Fac Engn & Appl Sci, Oshawa, ON L1H 7K4, Canada
Keywords
Feature selection; supervised learning; text classification; dimensionality reduction; probability; bounds; terms
DOI
10.3233/IDA-150763
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
One of the major tasks in text categorization systems is dimensionality reduction, which strongly affects classification performance and scalability. Among dimensionality reduction methods, feature selection based on feature ranking, also known as best individual features, is scalable, simple, and inexpensive. However, the proper feature ranking method for a given data set is not obvious without conducting experiments on that data set, because performance varies with the data characteristics and the choice of classifier. This paper introduces a framework, called feature meta-ranking, that identifies the best feature ranking measure among a set of candidates for a particular text classification problem. The feature meta-ranking technique is implemented using the differential filter level performance method, which uses a simple classifier, such as Rocchio, to estimate the behavior of a feature ranking measure on a particular data set. Because a classifier appears in the feature selection loop, the proposed method can be considered a hybrid feature selection technique that makes minimal use of that classifier. The proposed method is evaluated on six data sets with seven feature ranking measures. The method is shown to be stable, in the sense that it is insensitive to the resolution of the filter level. The proposed method is also examined with more sophisticated classifiers such as support vector machines, and the results confirm the performance obtained with simple classifiers.
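The general idea in the abstract, scoring candidate ranking measures with a cheap classifier at several filter levels, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's differential filter level performance method, whose exact definition the abstract does not give. The two ranking measures (document frequency and a class-rate gap), the nearest-centroid stand-in for Rocchio, the toy corpus, and every function name are hypothetical choices made for this example only.

```python
import math
from collections import Counter, defaultdict


def rank_by_document_frequency(docs, labels):
    """Hypothetical measure 1: rank terms by how many documents contain them."""
    df = Counter(t for d in docs for t in set(d))
    return sorted(df, key=lambda t: (-df[t], t))


def rank_by_class_gap(docs, labels):
    """Hypothetical measure 2: rank terms by the spread of their per-class document rates."""
    classes = sorted(set(labels))
    n = Counter(labels)
    df = {c: Counter() for c in classes}
    for d, y in zip(docs, labels):
        df[y].update(set(d))
    vocab = {t for d in docs for t in d}

    def gap(t):
        rates = [df[c][t] / n[c] for c in classes]
        return max(rates) - min(rates)

    return sorted(vocab, key=lambda t: (-gap(t), t))


def centroid_fit(docs, labels, features):
    """Rocchio-style training: one mean term-frequency vector per class."""
    sums = defaultdict(Counter)
    counts = Counter(labels)
    for d, y in zip(docs, labels):
        sums[y].update(t for t in d if t in features)
    return {c: {t: v / counts[c] for t, v in sums[c].items()} for c in sums}


def cosine(a, b):
    dot = sum(v * b.get(t, 0.0) for t, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def centroid_predict(centroids, doc, features):
    """Assign the class whose centroid is most similar (ties go to the first class name)."""
    vec = Counter(t for t in doc if t in features)
    return max(sorted(centroids), key=lambda c: cosine(vec, centroids[c]))


def meta_rank(measures, train, test, levels=(0.25, 0.5, 1.0)):
    """Order ranking measures by mean held-out accuracy across filter levels."""
    (x_tr, y_tr), (x_te, y_te) = train, test
    scores = {}
    for name, rank in measures.items():
        ranking = rank(x_tr, y_tr)
        accs = []
        for level in levels:
            # A filter level is taken here to be the fraction of top-ranked terms kept.
            k = max(1, math.ceil(level * len(ranking)))
            feats = set(ranking[:k])
            cents = centroid_fit(x_tr, y_tr, feats)
            hits = sum(centroid_predict(cents, d, feats) == y for d, y in zip(x_te, y_te))
            accs.append(hits / len(y_te))
        scores[name] = sum(accs) / len(accs)
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy corpus, invented for illustration only.
train = (
    [["the", "game", "team"], ["the", "team", "win"], ["the", "game", "win"],
     ["the", "code", "bug"], ["the", "code", "chip"], ["the", "chip", "bug"]],
    ["sports", "sports", "sports", "tech", "tech", "tech"],
)
test = (
    [["the", "team", "game"], ["the", "bug", "code"], ["the", "win"], ["the", "chip"]],
    ["sports", "tech", "sports", "tech"],
)
measures = {
    "document_frequency": rank_by_document_frequency,
    "class_gap": rank_by_class_gap,
}
result = meta_rank(measures, train, test)
for name, score in result:
    print(f"{name}: mean accuracy over filter levels = {score:.2f}")
```

Averaging accuracy over a handful of filter levels is only one plausible way to summarize a measure's behavior; the paper's own criterion, its seven measures, and its six data sets may differ from everything shown here.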
Pages: 1151-1170
Number of pages: 20