Integrating incremental feature weighting into Naive Bayes text classifier

Cited by: 0
Authors
Kim, Han Joon [1]
Chang, Jaeyoung [1]
Affiliation
[1] Univ Seoul, Dept Elect & Comp Engn, Seoul, South Korea
Keywords
text classification; Naive Bayes classifier; feature weighting; feature selection; χ²-statistic
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In real-world operational environments, text classification systems must cope with incomplete training sets and no prior knowledge of the feature space. For this setting, Naive Bayes is the most appropriate algorithm because its pre-learned classification model and feature space are easy to update incrementally. Our work focuses on improving the Naive Bayes classifier through a feature weighting strategy. The basic idea is that parameter estimation in Naive Bayes can take into account the degree of feature importance as well as the feature distribution. In addition, we extend a conventional algorithm for incremental feature update to develop a dynamic feature space for operational environments. Through experiments on the Reuters-21578 and 20 Newsgroups benchmark collections, we show that the traditional multinomial Naive Bayes classifier can be significantly improved by χ²-statistic-based feature weighting.
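The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' formulation: the abstract does not give the weighting formula, so the scheme below (each term count scaled by 1 + max χ² / N before Laplace smoothing) is an assumption chosen only for illustration, and the class name `WeightedIncrementalNB` and its methods are hypothetical. The sketch keeps raw counts, so both the χ² weights and the vocabulary can be refreshed whenever a new labelled document arrives.

```python
import math
from collections import Counter, defaultdict


class WeightedIncrementalNB:
    """Multinomial Naive Bayes with count-based incremental updates (illustrative)."""

    def __init__(self):
        self.doc_count = Counter()               # class -> number of training documents
        self.term_count = defaultdict(Counter)   # class -> term -> total occurrences
        self.doc_freq = defaultdict(Counter)     # class -> term -> documents containing term
        self.vocab = set()                       # feature space, grows over time

    def update(self, tokens, label):
        """Fold one labelled document into the counts; unseen terms extend the vocabulary."""
        self.doc_count[label] += 1
        self.term_count[label].update(tokens)
        self.doc_freq[label].update(set(tokens))
        self.vocab.update(tokens)

    def chi2(self, term, label):
        """Chi-square statistic of the 2x2 term/class document contingency table."""
        n = sum(self.doc_count.values())
        a = self.doc_freq[label][term]                               # in class, has term
        b = sum(self.doc_freq[c][term] for c in self.doc_count) - a  # other classes, has term
        c = self.doc_count[label] - a                                # in class, lacks term
        d = n - a - b - c                                            # other classes, lacks term
        denom = (a + c) * (b + d) * (a + b) * (c + d)
        return n * (a * d - c * b) ** 2 / denom if denom else 0.0

    def weight(self, term):
        # Assumed scheme (not from the paper): 1 + max_c chi2(term, c) / N, which lies in [1, 2].
        n = sum(self.doc_count.values())
        return 1.0 + max(self.chi2(term, c) for c in self.doc_count) / n

    def predict(self, tokens):
        """Return the class with the highest log posterior; out-of-vocabulary tokens are skipped."""
        n = sum(self.doc_count.values())
        weights = {t: self.weight(t) for t in self.vocab}
        best_label, best_score = None, -math.inf
        for label in self.doc_count:
            counts = self.term_count[label]
            # Laplace-smoothed normaliser over weight-scaled counts.
            denom = sum(weights[t] * counts[t] + 1.0 for t in self.vocab)
            score = math.log(self.doc_count[label] / n)
            for t in tokens:
                if t in self.vocab:
                    score += math.log((weights[t] * counts[t] + 1.0) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy data, invented for this example.
model = WeightedIncrementalNB()
for text, label in [
    ("wheat grain export", "grain"),
    ("grain harvest wheat", "grain"),
    ("oil crude barrel", "crude"),
    ("crude oil price", "crude"),
]:
    model.update(text.split(), label)

print(model.predict(["wheat", "harvest"]))
```

Because the model stores only counts, calling `update` again with a document that contains unseen terms is enough to grow the feature space; the weights are recomputed from the counts at prediction time. Recomputing every weight on each call is wasteful for a real vocabulary, so a practical version would cache them.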
Pages: 1137-1143
Number of pages: 7
Related papers
50 in total
  • [1] Advanced Naive Bayes Text Classifier with Embedded Feature Weighting Approach
    Kim, Han-joon
    [J]. INFORMATION-AN INTERNATIONAL INTERDISCIPLINARY JOURNAL, 2009, 12 (03) : 607 - 620
  • [2] Two feature weighting approaches for naive Bayes text classifiers
    Zhang, Lungan
    Jiang, Liangxiao
    Li, Chaoqun
    Kong, Ganggang
    [J]. KNOWLEDGE-BASED SYSTEMS, 2016, 100 : 137 - 144
  • [3] DEEP FEATURE WEIGHTING IN NAIVE BAYES FOR CHINESE TEXT CLASSIFICATION
    Jiang, Qiaowei
    Wang, Wen
    Han, Xu
    Zhang, Shasha
    Wang, Xinyan
    Wang, Cong
    [J]. PROCEEDINGS OF 2016 4TH IEEE INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND INTELLIGENCE SYSTEMS (IEEE CCIS 2016), 2016, : 160 - 164
  • [4] Naive Bayes text classifier
    Zhang, Haiyi
    Li, Di
    [J]. GRC: 2007 IEEE INTERNATIONAL CONFERENCE ON GRANULAR COMPUTING, PROCEEDINGS, 2007, : 708 - 711
  • [5] Deep feature weighting for naive Bayes and its application to text classification
    Jiang, Liangxiao
    Li, Chaoqun
    Wang, Shasha
    Zhang, Lungan
    [J]. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2016, 52 : 26 - 39
  • [6] Speeding up incremental wrapper feature subset selection with Naive Bayes classifier
    Bermejo, Pablo
    Gamez, Jose A.
    Puerta, Jose M.
    [J]. KNOWLEDGE-BASED SYSTEMS, 2014, 55 : 140 - 147
  • [7] Incremental discretization for Naive-Bayes classifier
    Lu, Jingli
    Yang, Ying
    Webb, Geoffrey I.
    [J]. ADVANCED DATA MINING AND APPLICATIONS, PROCEEDINGS, 2006, 4093 : 223 - 238
  • [8] Class dependent feature scaling method using naive Bayes classifier for text datamining
    Youn, Eunseog
    Jeong, Myong K.
    [J]. PATTERN RECOGNITION LETTERS, 2009, 30 (05) : 477 - 485
  • [9] Integrating Global and Local Application of Naive Bayes Classifier
    Kotsiantis, Sotiris
    [J]. INTERNATIONAL ARAB JOURNAL OF INFORMATION TECHNOLOGY, 2014, 11 (03) : 300 - 307
  • [10] Estimating a one-class naive Bayes text classifier
    Zhang, Yihong
    Jatowt, Adam
    [J]. INTELLIGENT DATA ANALYSIS, 2020, 24 (03) : 567 - 579