A Variance-mean Based Feature Selection in Text Classification

Cited by: 3
Authors
Yin, Shen [1]
Jiang, Zongli [1]
Affiliation
[1] Beijing University of Technology, Beijing, People's Republic of China
Keywords
feature selection; variance-mean; text classification
DOI
10.1109/ETCS.2009.646
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Feature selection is an important step in text classification: it chooses a subset of features relevant to a particular application. Building on the mutual information method, we design a variance-mean based feature selection method (VM). After computing and ranking the variance of the class discrimination value vector for each word, we choose the most distinguishable features. The method is advantageous when only a small number of features is selected, especially for classes with few training documents. It keeps the best features and thus improves the final performance of the classification system. Experimental results indicate the effectiveness of the proposed feature selection method in text classification.
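
The ranking step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the formulas, so everything specific here is an assumption: the class discrimination value is taken to be the pointwise mutual information between a word and each class, estimated from document counts with add-one smoothing, and a word's score is the population variance of that vector across classes. The function names `vm_scores` and `select_features` are hypothetical, and the exact role of the mean in the authors' VM method is not modeled beyond its use inside the variance.

```python
import math
from collections import Counter, defaultdict


def vm_scores(docs, labels):
    """Score each word by the variance of its per-class PMI vector.

    Assumed form only: the source abstract does not specify the formula.
    docs is a list of token lists, labels the matching class labels.
    """
    classes = sorted(set(labels))
    n_docs = len(docs)
    class_count = Counter(labels)      # number of documents per class
    word_count = Counter()             # number of documents containing each word
    joint = defaultdict(Counter)       # joint[word][cls]: docs of cls containing word
    for tokens, cls in zip(docs, labels):
        for w in set(tokens):
            word_count[w] += 1
            joint[w][cls] += 1

    scores = {}
    for w, n_w in word_count.items():
        vec = []
        for c in classes:
            # Smoothed PMI log(P(w, c) / (P(w) * P(c))); add-one smoothing is an assumption.
            p_wc = (joint[w][c] + 1) / (n_docs + len(classes))
            p_w = n_w / n_docs
            p_c = class_count[c] / n_docs
            vec.append(math.log(p_wc / (p_w * p_c)))
        mean = sum(vec) / len(vec)
        # Population variance of the class discrimination vector.
        scores[w] = sum((v - mean) ** 2 for v in vec) / len(vec)
    return scores


def select_features(docs, labels, k):
    """Return the k highest-scoring words, with ties broken alphabetically."""
    scores = vm_scores(docs, labels)
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]
```

Under these assumptions, a word that occurs equally often in every class has a flat discrimination vector and a variance near zero, while a word concentrated in one class scores high, which matches the abstract's goal of keeping the most distinguishable features.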
Pages: 519-522
Number of pages: 4