Naive Bayes for text classification with unbalanced classes

Cited by: 0
Authors
Frank, Eibe [1]
Bouckaert, Remco R.
Affiliations
[1] Univ Waikato, Dept Comp Sci, Hamilton, New Zealand
[2] Xtal Mt Informat Technol, Auckland, New Zealand
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Multinomial naive Bayes (MNB) is a popular method for document classification due to its computational efficiency and relatively good predictive performance. It has recently been established that predictive performance can be improved further by appropriate data transformations [1,2]. In this paper we present another transformation that is designed to combat a potential problem with the application of MNB to unbalanced datasets. We propose an appropriate correction by adjusting attribute priors. This correction can be implemented as another data normalization step, and we show that it can significantly improve the area under the ROC curve. We also show that the modified version of MNB is very closely related to the simple centroid-based classifier and compare the two methods empirically.
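The abstract does not spell out the exact form of the correction, so the following is only an illustrative sketch of how a per-class normalization step could be placed in front of multinomial naive Bayes. It assumes that each class's aggregated word counts are rescaled to a common total before Laplace smoothing, so that the smoothing term weighs equally on large and small classes. The function names `fit_mnb` and `predict_log_scores`, the parameters `beta` and `alpha`, and the toy data are all hypothetical and not taken from the paper.

```python
import numpy as np

def fit_mnb(X, y, normalize=False, beta=1.0, alpha=1.0):
    """Fit multinomial naive Bayes on a dense count matrix X (docs x words).

    With normalize=True, each class's aggregated word counts are rescaled
    to sum to `beta` before smoothing. This is an assumed form of the
    normalization step, chosen for illustration only.
    """
    classes = np.unique(y)
    # Aggregate word counts per class: shape (n_classes, n_words)
    counts = np.vstack([X[y == c].sum(axis=0) for c in classes]).astype(float)
    if normalize:
        counts = beta * counts / counts.sum(axis=1, keepdims=True)
    # Laplace smoothing, then convert to log word probabilities per class
    smoothed = counts + alpha
    log_word_prob = np.log(smoothed / smoothed.sum(axis=1, keepdims=True))
    # Class priors estimated from document frequencies
    log_prior = np.log(np.array([(y == c).mean() for c in classes]))
    return classes, log_prior, log_word_prob

def predict_log_scores(model, X):
    """Return unnormalized log posterior scores, shape (n_docs, n_classes)."""
    _, log_prior, log_word_prob = model
    return log_prior + X @ log_word_prob.T

# Toy unbalanced data: four documents in class 0, one in class 1
X = np.array([[3, 0, 1], [2, 1, 0], [4, 0, 2], [3, 1, 1], [0, 2, 1]])
y = np.array([0, 0, 0, 0, 1])

plain = fit_mnb(X, y)
normed = fit_mnb(X, y, normalize=True)
scores = predict_log_scores(normed, X)
```

In this sketch the only difference between the two models is the scale of the counts that the smoothing constant `alpha` is added to. Without rescaling, the same `alpha` shifts the estimates of the small class far more than those of the large class.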
Pages: 503-510
Page count: 8
Related papers
50 in total
  • [1] An Improvement to Naive Bayes for Text Classification
    Zhang, Wei
    Gao, Feng
    [J]. CEIS 2011, 2011, 15
  • [2] Constrained Naive Bayes with application to unbalanced data classification
    Blanquero, Rafael
    Carrizosa, Emilio
    Ramirez-Cobo, Pepa
    Sillero-Denamiel, M. Remedios
    [J]. CENTRAL EUROPEAN JOURNAL OF OPERATIONS RESEARCH, 2022, 30 (04) : 1403 - 1425
  • [3] Adapting Hidden Naive Bayes for Text Classification
    Gan, Shengfeng
    Shao, Shiqi
    Chen, Long
    Yu, Liangjun
    Jiang, Liangxiao
    [J]. MATHEMATICS, 2021, 9 (19)
  • [4] Adapting naive Bayes tree for text classification
    Shasha Wang
    Liangxiao Jiang
    Chaoqun Li
    [J]. Knowledge and Information Systems, 2015, 44 : 77 - 89
  • [5] Bayesian Naive Bayes classifiers to text classification
    Xu, Shuo
    [J]. JOURNAL OF INFORMATION SCIENCE, 2018, 44 (01) : 48 - 59
  • [6] Feature selection for text classification with Naive Bayes
    Chen, Jingnian
    Huang, Houkuan
    Tian, Shengfeng
    Qu, Youli
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2009, 36 (03) : 5432 - 5435
  • [7] Adapting naive Bayes tree for text classification
    Wang, Shasha
    Jiang, Liangxiao
    Li, Chaoqun
    [J]. KNOWLEDGE AND INFORMATION SYSTEMS, 2015, 44 (01) : 77 - 89
  • [8] A Technique for Improving the Performance of Naive Bayes Text Classification
    Jiang, Yuqian
    Lin, Huaizhong
    Wang, Xuesong
    Lu, Dongming
    [J]. WEB INFORMATION SYSTEMS AND MINING, PT II, 2011, 6988 : 196 - 203
  • [9] An improved FloatBoost algorithm for Naive Bayes text classification
    Liu, XM
    Yin, JW
    Dong, JX
    Ghafoor, MA
    [J]. ADVANCES IN WEB-AGE INFORMATION MANAGEMENT, PROCEEDINGS, 2005, 3739 : 162 - 171
  • [10] Some effective techniques for naive Bayes text classification
    Kim, Sang-Bum
    Han, Kyoung-Soo
    Rim, Hae-Chang
    Myaeng, Sung Hyon
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2006, 18 (11) : 1457 - 1466