Discrimination-based feature selection for multinomial naive Bayes text classification

被引:0
|
作者
Zhu, Jingbo [1 ]
Wang, Huizhen [1 ]
Zhang, Xijuan [1 ]
机构
[1] NE Univ, Nat Language Proc Lab, Inst Comp Software & Theory, Shenyang, Peoples R China
基金
中国国家自然科学基金;
关键词
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
In this paper we focus on the problem of class discrimination issues to improve performance of text classification, and study a discrimination-based feature selection technique in which the features are selected based on the criterion of enlarging separation among competing classes, referred to as discrimination capability. The proposed approach discards features with small discrimination capability measured by Gaussian divergence, so as to enhance the robustness and the discrimination power of the text classification system. To evaluation its performance, some comparison experiments of multinomial naive Bayes classifier model are constructed on Newsgroup and Ruters21578 data collection. Experimental results show that on Newsgroup data set divergence measure outperforms MI measure, and has slight better performance than DF measure, and outperforms both measures on Ruters21578 data set. It shows that discrimination-based feature selection method has good contributions to enhance discrimination power of text classification model.
引用
收藏
页码:149 / +
页数:2
相关论文
共 50 条
  • [31] Adapting Hidden Naive Bayes for Text Classification
    Gan, Shengfeng
    Shao, Shiqi
    Chen, Long
    Yu, Liangjun
    Jiang, Liangxiao
    [J]. MATHEMATICS, 2021, 9 (19)
  • [32] Adapting naive Bayes tree for text classification
    Shasha Wang
    Liangxiao Jiang
    Chaoqun Li
    [J]. Knowledge and Information Systems, 2015, 44 : 77 - 89
  • [33] Bayesian Naive Bayes classifiers to text classification
    Xu, Shuo
    [J]. JOURNAL OF INFORMATION SCIENCE, 2018, 44 (01) : 48 - 59
  • [34] Adapting naive Bayes tree for text classification
    Wang, Shasha
    Jiang, Liangxiao
    Li, Chaoqun
    [J]. KNOWLEDGE AND INFORMATION SYSTEMS, 2015, 44 (01) : 77 - 89
  • [35] Naive Bayes for text classification with unbalanced classes
    Frank, Eibe
    Bouckaert, Remco R.
    [J]. KNOWLEDGE DISCOVERY IN DATABASES: PKDD 2006, PROCEEDINGS, 2006, 4213 : 503 - 510
  • [36] Feature selection for optimizing the Naive Bayes algorithm
    Winarti, Titin
    Vydia, Vensy
    [J]. ENGINEERING, INFORMATION AND AGRICULTURAL TECHNOLOGY IN THE GLOBAL DIGITAL REVOLUTION, 2020, : 47 - 51
  • [37] Human action recognition based on boosted feature selection and naive Bayes nearest-neighbor classification
    Liu, Li
    Shao, Ling
    Rockett, Peter
    [J]. SIGNAL PROCESSING, 2013, 93 (06) : 1521 - 1530
  • [38] A method of the feature selection in hierarchical text classification based on the category discrimination and position information
    Song, Jia
    Zhang, Pengzhou
    Qin, Sijun
    Gong, Junpeng
    [J]. 2015 INTERNATIONAL CONFERENCE ON INDUSTRIAL INFORMATICS - COMPUTING TECHNOLOGY, INTELLIGENT TECHNOLOGY, INDUSTRIAL INFORMATION INTEGRATION (ICIICII), 2015, : 132 - +
  • [39] Chinese News Text Multi Classification Based on Naive Bayes Algorithm
    Wang, Fei
    Deng, Xin
    Hou, Lunqing
    [J]. ISCSIC'18: PROCEEDINGS OF THE 2ND INTERNATIONAL SYMPOSIUM ON COMPUTER SCIENCE AND INTELLIGENT CONTROL, 2018,
  • [40] Semi-Supervised Multinomial Naive Bayes for Text Classification by Leveraging Word-Level Statistical Constraint
    Zhao, Li
    Huang, Minlie
    Yao, Ziyu
    Su, Rongwei
    Jiang, Yingying
    Zhu, Xiaoyan
    [J]. THIRTIETH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2016, : 2877 - 2883