Discrimination-based feature selection for multinomial naive Bayes text classification

Authors
Zhu, Jingbo [1 ]
Wang, Huizhen [1 ]
Zhang, Xijuan [1 ]
Affiliations
[1] NE Univ, Nat Language Proc Lab, Inst Comp Software & Theory, Shenyang, Peoples R China
Funding
National Natural Science Foundation of China
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper we address class discrimination in order to improve text classification performance, and study a discrimination-based feature selection technique in which features are selected by the criterion of enlarging the separation among competing classes, referred to as discrimination capability. The proposed approach discards features with small discrimination capability, as measured by Gaussian divergence, so as to enhance the robustness and discrimination power of the text classification system. To evaluate its performance, comparison experiments with a multinomial naive Bayes classifier are conducted on the Newsgroup and Reuters-21578 data collections. Experimental results show that on the Newsgroup data set the divergence measure outperforms the mutual information (MI) measure and performs slightly better than the document frequency (DF) measure, and that it outperforms both measures on the Reuters-21578 data set. This indicates that the discrimination-based feature selection method contributes to enhancing the discrimination power of the text classification model.
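The abstract does not give the divergence formula, so the following is only a minimal sketch of the general idea, under stated assumptions: each term's count is modeled per class as a univariate Gaussian, the divergence between two classes is taken to be the symmetric Kullback-Leibler divergence between those Gaussians, and a term's score is the sum over all class pairs. The function names (`gaussian_divergence`, `class_stats`, `divergence_scores`, `select_features`), the smoothing constant `eps`, and the toy data are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def gaussian_divergence(mu1, var1, mu2, var2):
    """Symmetric KL divergence between N(mu1, var1) and N(mu2, var2)."""
    spread = 0.5 * (var1 / var2 + var2 / var1 - 2.0)
    shift = 0.5 * (mu1 - mu2) ** 2 * (1.0 / var1 + 1.0 / var2)
    return spread + shift


def class_stats(X, y, eps=1e-6):
    """Per-class mean and variance of each term count.

    eps is an assumed smoothing constant that keeps variances positive.
    """
    stats = {}
    n_feat = len(X[0])
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        n = len(rows)
        means = [sum(r[j] for r in rows) / n for j in range(n_feat)]
        variances = [
            sum((r[j] - means[j]) ** 2 for r in rows) / n + eps
            for j in range(n_feat)
        ]
        stats[label] = (means, variances)
    return stats


def divergence_scores(X, y):
    """Score each term by its divergence summed over all class pairs."""
    stats = class_stats(X, y)
    scores = [0.0] * len(X[0])
    for a, b in combinations(sorted(stats), 2):
        (mu_a, var_a), (mu_b, var_b) = stats[a], stats[b]
        for j in range(len(scores)):
            scores[j] += gaussian_divergence(mu_a[j], var_a[j], mu_b[j], var_b[j])
    return scores


def select_features(X, y, k):
    """Keep the k highest-scoring terms; low-divergence terms are discarded."""
    scores = divergence_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


# Toy term-count matrix (rows are documents, columns are terms); illustrative only.
X = [[5, 1, 0], [6, 1, 1], [0, 1, 0], [1, 1, 1]]
y = ["sports", "sports", "politics", "politics"]
kept = select_features(X, y, k=1)
```

In this toy data only the first term has different per-class means, so it is the one retained. The selected columns could then be passed to any multinomial naive Bayes implementation.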
Pages: 149 / +
Page count: 2