An Efficient Feature Selection using Hidden Topic in Text Categorization

Cited by: 10
Authors
Zhang, Zhiwei [1]
Phan, Xuan-Hieu [1]
Horiguchi, Susumu [1]
Affiliations
[1] Tohoku Univ, Grad Sch Informat Sci, Sendai, Miyagi 980, Japan
Source
2008 22ND INTERNATIONAL WORKSHOPS ON ADVANCED INFORMATION NETWORKING AND APPLICATIONS, VOLS 1-3 | 2008
DOI
10.1109/WAINA.2008.137
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Text categorization is an important research area in information retrieval. To save storage space and achieve better accuracy, efficient and effective feature selection methods for reducing the data before analysis are highly desired. Usually, research on feature selection uses only a single measure such as information gain. In this paper, we propose a new feature selection method that adopts hidden topic analysis and entropy-based feature ranking. Experiments on the well-known Reuters-21578 and Ohsumed datasets show that our method can achieve better classification accuracy while reducing the feature dimension dramatically.
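For readers unfamiliar with the baseline the abstract mentions, the following is a minimal sketch of ranking terms by information gain. It is not the hidden-topic method proposed in the paper, whose details are not given in this record. The function names (`entropy`, `information_gain`, `rank_terms`), the representation of documents as sets of tokens, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(docs, labels, term):
    """IG(term) = H(C) - [P(t) H(C | t) + P(not t) H(C | not t)].

    docs is a list of token sets (an assumed representation);
    labels is the parallel list of class labels.
    """
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = (len(with_term) / n) * entropy(with_term) \
        + (len(without_term) / n) * entropy(without_term)
    return entropy(labels) - conditional


def rank_terms(docs, labels, top_k):
    """Return the top_k terms by information gain (ties broken alphabetically)."""
    vocab = set().union(*docs)
    ranked = sorted(vocab, key=lambda t: (-information_gain(docs, labels, t), t))
    return ranked[:top_k]


# Hypothetical toy corpus: two "grain" and two "medical" documents.
docs = [
    {"wheat", "price"},
    {"wheat", "farm"},
    {"heart", "patient"},
    {"patient", "price"},
]
labels = ["grain", "grain", "medical", "medical"]

selected = rank_terms(docs, labels, top_k=2)
print(selected)
```

In this toy corpus, "wheat" and "patient" each separate the two classes perfectly, while "price" occurs once in each class and carries no information, so a selector keeping the top two terms would discard it.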
Pages: 1223-1228
Page count: 6
Related papers
50 in total
  • [31] Applying cascaded feature selection to SVM text categorization
    Masuyama, T
    Nakagawa, H
    13TH INTERNATIONAL WORKSHOP ON DATABASE AND EXPERT SYSTEMS APPLICATIONS, PROCEEDINGS, 2002, : 241 - 245
  • [32] Enhancement of DTP feature selection method for text categorization
    Moyotl-Hernández, E
    Jiménez-Salazar, H
    COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING, 2005, 3406 : 719 - 722
  • [33] Feature selection for support vector machines in text categorization
    Liu, Y
    Lu, HM
    Lu, ZX
    Wang, P
    MLMTA'03: INTERNATIONAL CONFERENCE ON MACHINE LEARNING; MODELS, TECHNOLOGIES AND APPLICATIONS, 2003, : 129 - 134
  • [34] Feature Selection with Structural Sparse Mode for Text Categorization
    Zheng, Wenbin
    Tang, Dan
    Zhang, Haiqing
    Tang, Hong
    2017 NINTH INTERNATIONAL CONFERENCE ON INTELLIGENT HUMAN-MACHINE SYSTEMS AND CYBERNETICS (IHMSC 2017), VOL 1, 2017, : 359 - 362
  • [35] PKIP: Feature selection in text categorization for item banks
    Nuntiyagul, A
    Naruedomkul, K
    Cercone, N
    Wongsawang, D
    ICTAI 2005: 17TH IEEE INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2005, : 212 - 216
  • [36] Feature subset selection in SOM based text categorization
    Bassiouny, S
    Nagi, M
    Hussein, MF
    IC-AI '04 & MLMTA'04 , VOL 1 AND 2, PROCEEDINGS, 2004, : 860 - 866
  • [37] A discriminative and semantic feature selection method for text categorization
    Zong, Wei
    Wu, Feng
    Chu, Lap-Keung
    Sculli, Domenic
    INTERNATIONAL JOURNAL OF PRODUCTION ECONOMICS, 2015, 165 : 215 - 222
  • [38] Comparison and Improvement of feature selection method for text categorization
    Shan, Li-Li
    Liu, Bing-Quan
    Sun, Cheng-Jie
    Harbin Gongye Daxue Xuebao/Journal of Harbin Institute of Technology, 2011, 43 (SUPPL. 1): : 319 - 324
  • [39] Maximum entropy modeling with feature selection for text categorization
    Cai, Jihong
    Song, Fei
    INFORMATION RETRIEVAL TECHNOLOGY, 2008, 4993 : 549 - 554
  • [40] Measures of rule quality for feature selection in Text Categorization
    Montañés, E
    Fernández, J
    Díaz, I
    Combarro, EF
    Ranilla, J
    ADVANCES IN INTELLIGENT DATA ANALYSIS V, 2003, 2810 : 589 - 598