An Efficient Feature Selection using Hidden Topic in Text Categorization

Cited by: 10
Authors
Zhang, Zhiwei [1 ]
Phan, Xuan-Hieu [1 ]
Horiguchi, Susumu [1 ]
Affiliations
[1] Tohoku Univ, Grad Sch Informat Sci, Sendai, Miyagi 980, Japan
DOI
10.1109/WAINA.2008.137
Chinese Library Classification (CLC)
TP3 [Computing technology, computer technology];
Discipline classification code
0812 ;
Abstract
Text categorization is an important research area in information retrieval. To save storage space and achieve better accuracy, efficient and effective feature selection methods that reduce the data before analysis are highly desirable. Usually, research on feature selection uses only a single measure such as information gain. In this paper, we propose a new feature selection method that adopts hidden topic analysis and entropy-based feature ranking. Experiments on the well-known Reuters-21578 and Ohsumed datasets show that our method can achieve better classification accuracy while reducing the feature dimension dramatically.
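For readers unfamiliar with the information gain measure mentioned in the abstract, the following is a minimal Python sketch of ranking terms by information gain. It illustrates only that conventional baseline measure, not the hidden-topic method the paper proposes. The function names (`entropy`, `information_gain`, `rank_terms`) and the four-document toy corpus are assumptions made for illustration and do not come from the paper or its datasets.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels).values()
    return -sum((n / total) * math.log2(n / total) for n in counts)


def information_gain(docs, labels, term):
    """Information gain of the binary feature 'term occurs in the document'.

    docs is a list of sets of terms; labels holds one class label per document.
    """
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = (len(with_term) / n) * entropy(with_term) \
        + (len(without_term) / n) * entropy(without_term)
    return entropy(labels) - conditional


def rank_terms(docs, labels):
    """Sort the vocabulary by decreasing information gain (ties broken by name)."""
    vocabulary = set().union(*docs)
    return sorted(vocabulary, key=lambda t: (-information_gain(docs, labels, t), t))


# Hypothetical toy corpus: two categories, four documents.
docs = [{"wheat", "price"}, {"wheat", "export"}, {"oil", "price"}, {"oil", "barrel"}]
labels = ["grain", "grain", "crude", "crude"]

ranking = rank_terms(docs, labels)
print(ranking)
```

In this toy corpus, "wheat" and "oil" separate the two categories perfectly and rank first, while "price" occurs equally in both and ranks last. Keeping only the top-ranked terms is the usual way such a score is turned into a reduced feature set.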
Pages: 1223-1228
Page count: 6
Related papers
  • [1] An efficient feature selection using multi-criteria in text categorization
    Doan, S
    Horiguchi, S
    HIS'04: FOURTH INTERNATIONAL CONFERENCE ON HYBRID INTELLIGENT SYSTEMS, PROCEEDINGS, 2005, : 86 - 91
  • [2] A comparative study on feature selection of text categorization for hidden Markov models
    Yi, K
    Beheshti, J
    CANADIAN JOURNAL OF INFORMATION AND LIBRARY SCIENCE-REVUE CANADIENNE DES SCIENCES DE L INFORMATION ET DE BIBLIOTHECONOMIE, 2004, 28 (03): : 101 - 101
  • [3] Using typical testors for feature selection in text categorization
    Pons-Porrata, Aurora
    Gil-Garcia, Reynaldo
    Berlanga-Llavori, Rafael
    PROGRESS IN PATTERN RECOGNITION, IMAGE ANALYSIS AND APPLICATIONS, PROCEEDINGS, 2007, 4756 : 643 - +
  • [4] Efficient n-gram construction for text categorization using feature selection techniques
    Garcia, Maximiliano
    Maldonado, Sebastian
    Vairetti, Carla
    INTELLIGENT DATA ANALYSIS, 2021, 25 (03) : 509 - 525
  • [5] AN EFFICIENT FEATURE SELECTION METHOD USING NAMED ENTITY RECOGNITION FOR CHINESE TEXT CATEGORIZATION
    Liu, Bin
    Li, Chunping
    PROCEEDINGS OF 2009 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-6, 2009, : 3527 - +
  • [6] Feature selection in SVM text categorization
    Taira, H
    Haruno, M
    SIXTEENTH NATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE (AAAI-99)/ELEVENTH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE (IAAI-99), 1999, : 480 - 486
  • [7] Feature selection strategies for text categorization
    Soucy, P
    Mineau, GW
    ADVANCES IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2003, 2671 : 505 - 509
  • [8] Best terms: an efficient feature-selection algorithm for text categorization
    Fragoudis, D
    Meretakis, D
    Likothanassis, S
    KNOWLEDGE AND INFORMATION SYSTEMS, 2005, 8 (01) : 16 - 33
  • [9] Best terms: an efficient feature-selection algorithm for text categorization
    Dimitris Fragoudis
    Dimitris Meretakis
    Spiridon Likothanassis
    Knowledge and Information Systems, 2005, 8 : 16 - 33
  • [10] Naive bayes text categorization using improved feature selection
    Lin, Kunhui
    Kang, Kai
    Huang, Yunping
    Zhou, Changle
    Wang, Beizhan
    Journal of Computational Information Systems, 2007, 3 (03): : 1159 - 1164