Feature Selection Method Based on Crossed Centroid for Text Categorization

Cited by: 0
Authors
Yang, Jieming [1]
Liu, Zhiying [1]
Qu, Zhaoyang [1]
Wang, Jing [1]
Affiliations
[1] Northeast Dianli Univ, Sch Informat Engn, Jilin, Jilin, Peoples R China
Keywords
feature selection; text categorization; across centroid; high dimension; ALGORITHM;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A defining characteristic of text categorization is high dimensionality, even for moderately sized datasets. Feature selection, which can reduce dimensionality without sacrificing categorization performance and helps avoid over-fitting, is a commonly used approach to dimensionality reduction. In this paper, we propose a new feature selection method that evaluates the deviation from the centroid on both an inter-category and an intra-category basis. We compared the proposed method with four well-known feature selection algorithms using support vector machines on three benchmark datasets (20 Newsgroups, Reuters-21578, and WebKB). The experimental results show that the proposed method can significantly improve the performance of the classifier.
Pages: 11 - 15
Number of pages: 5
Related papers
50 in total
  • [21] A NEW FEATURE SELECTION METHOD FOR TEXT CATEGORIZATION BASED ON INFORMATION GAIN AND PARTICLE SWARM OPTIMIZATION
    Yigit, Ferruh
    Baykan, Omer Kaan
    2014 IEEE 3RD INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND INTELLIGENCE SYSTEMS (CCIS), 2014, : 523 - 529
  • [22] Normalized and classified feature selection in text categorization
    Wang, XJ
    Guo, J
    Zheng, KF
    INTERNATIONAL SYMPOSIUM ON COMMUNICATIONS AND INFORMATION TECHNOLOGIES 2005, VOLS 1 AND 2, PROCEEDINGS, 2005, : 173 - 176
  • [23] Improving Text Categorization by Multicriteria Feature Selection
    Doan, Son
    Horiguchi, Susumu
    JOURNAL OF ADVANCED COMPUTATIONAL INTELLIGENCE AND INTELLIGENT INFORMATICS, 2005, 9 (05) : 570 - 575
  • [24] A novel feature selection algorithm for text categorization
    Shang, Wenqian
    Huang, Houkuan
    Zhu, Haibin
    Lin, Yongmin
    Qu, Youli
    Wang, Zhihai
    EXPERT SYSTEMS WITH APPLICATIONS, 2007, 33 (01) : 1 - 5
  • [25] Study on Feature Selection in Finance Text Categorization
    Sun, Changqiu
    Wang, Xiaolong
    Xu, Jun
    2009 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS (SMC 2009), VOLS 1-9, 2009, : 5077 - 5082
  • [26] Study on constraints for feature selection in text categorization
    Xu, Yan
    Li, Jintao
    Wang, Bin
    Sun, Chunming
    Zhang, Sen
    Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2008, 45 (04) : 596 - 602
  • [27] Words as rules: Feature selection in text categorization
    Montañés, E
    Combarro, EF
    Díaz, I
    Ranilla, J
    Quevedo, JR
    COMPUTATIONAL SCIENCE - ICCS 2004, PT 1, PROCEEDINGS, 2004, 3036 : 666 - 669
  • [28] A new approach to feature selection for text categorization
    Li, SS
    Zong, CQ
    PROCEEDINGS OF THE 2005 IEEE INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND KNOWLEDGE ENGINEERING (IEEE NLP-KE'05), 2005, : 626 - 630
  • [29] Cascaded feature selection in SVMs text categorization
    Masuyama, T
    Nakagawa, H
    COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING, PROCEEDINGS, 2003, 2588 : 588 - 591
  • [30] A General Framework of Feature Selection for Text Categorization
    Jing, Hongfang
    Wang, Bin
    Yang, Yahui
    Xu, Yan
    MACHINE LEARNING AND DATA MINING IN PATTERN RECOGNITION, 2009, 5632 : 647 - +