Bayesian Naive Bayes classifiers to text classification

Cited by: 157
Author
Xu, Shuo [1]
Affiliation
[1] Inst Sci & Tech Informat China, Res Ctr Informat Sci Theory & Methodol, 15 Fuxing Rd, Beijing 100038, Peoples R China
Funding
National Science Foundation (USA)
Keywords
Bayesian Naive Bayes classifier; event model; Naive Bayes classifier; text classification; DECISION
DOI
10.1177/0165551516677946
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Text classification is the task of assigning predefined categories to natural language documents, and it can provide conceptual views of document collections. The Naive Bayes (NB) classifier is a family of simple probabilistic classifiers based on the common assumption that all features are independent of each other given the category variable, and it is often used as a baseline in text classification. However, classical NB classifiers with multinomial, Bernoulli and Gaussian event models are not fully Bayesian. This study proposes three Bayesian counterparts, and it turns out that the classical NB classifier with the Bernoulli event model is equivalent to its Bayesian counterpart. Finally, experimental results on the 20 Newsgroups and WebKB data sets show that the Bayesian NB classifier with the multinomial event model performs similarly to its classical counterpart, whereas the Bayesian NB classifier with the Gaussian event model performs clearly better than its classical counterpart.
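As a rough illustration of the classical baseline the abstract refers to, the sketch below implements an NB classifier with a multinomial event model and add-one (Laplace) smoothing. It is not the Bayesian variant proposed in the paper. The function names, the `alpha` parameter and the toy documents are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Estimate log priors and smoothed per-class word log probabilities.

    docs: list of token lists; labels: one class label per document.
    alpha is the additive smoothing constant (1.0 gives Laplace smoothing).
    """
    vocab = sorted({w for d in docs for w in d})
    class_docs = Counter(labels)
    word_counts = defaultdict(Counter)
    for doc, c in zip(docs, labels):
        word_counts[c].update(doc)

    log_prior = {c: math.log(n / len(docs)) for c, n in class_docs.items()}
    log_lik = {}
    for c in class_docs:
        total = sum(word_counts[c].values()) + alpha * len(vocab)
        log_lik[c] = {
            w: math.log((word_counts[c][w] + alpha) / total) for w in vocab
        }
    return log_prior, log_lik


def predict(doc, log_prior, log_lik):
    """Return the class with the highest posterior score for a token list."""
    scores = {}
    for c in log_prior:
        # Conditional independence: the document log likelihood is the sum
        # of per-word log probabilities. Out-of-vocabulary words are skipped.
        scores[c] = log_prior[c] + sum(
            log_lik[c][w] for w in doc if w in log_lik[c]
        )
    return max(scores, key=scores.get)


# Hypothetical toy corpus, used only to exercise the functions above.
toy_docs = [
    ["ball", "goal", "team"],
    ["team", "win", "goal"],
    ["stock", "market", "price"],
    ["market", "trade", "price"],
]
toy_labels = ["sport", "sport", "finance", "finance"]

log_prior, log_lik = train_multinomial_nb(toy_docs, toy_labels)
print(predict(["goal", "team", "win"], log_prior, log_lik))
print(predict(["price", "stock"], log_prior, log_lik))
```

Working in log space avoids numeric underflow when many small word probabilities are multiplied, and the smoothing constant keeps words unseen in a class from forcing its score to zero.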
Pages: 48-59
Number of pages: 12
Related papers
50 in total
  • [1] A Visual Tool for Bayesian Data Analysis: The Impact of Smoothing on Naive Bayes Text Classifiers
    Di Nunzio, Giorgio Maria
    Sordoni, Alessandro
    [J]. SIGIR 2012: PROCEEDINGS OF THE 35TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2012, : 1002 - 1002
  • [2] An Improvement to Naive Bayes for Text Classification
    Zhang, Wei
    Gao, Feng
    [J]. CEIS 2011, 2011, 15
  • [3] Building an Ensemble of Fine-Tuned Naive Bayesian Classifiers for Text Classification
    El Hindi, Khalil
    AlSalman, Hussien
    Qasem, Safwan
    Al Ahmadi, Saad
    [J]. ENTROPY, 2018, 20 (11)
  • [4] A New Feature Selection Approach to Naive Bayes Text Classifiers
    Zhang, Lungan
    Jiang, Liangxiao
    Li, Chaoqun
    [J]. INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2016, 30 (02)
  • [5] Two feature weighting approaches for naive Bayes text classifiers
    Zhang, Lungan
    Jiang, Liangxiao
    Li, Chaoqun
    Kong, Ganggang
    [J]. KNOWLEDGE-BASED SYSTEMS, 2016, 100 : 137 - 144
  • [6] Naive Bayes text classifiers: a locally weighted learning approach
    Jiang, Liangxiao
    Cai, Zhihua
    Zhang, Harry
    Wang, Dianhong
    [J]. JOURNAL OF EXPERIMENTAL & THEORETICAL ARTIFICIAL INTELLIGENCE, 2013, 25 (02) : 273 - 286
  • [7] A New Instance-weighting Naive Bayes Text Classifiers
    Wu, Yongcheng
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE OF INTELLIGENT ROBOTICS AND CONTROL ENGINEERING (IRCE), 2018, : 198 - 202
  • [8] Combining active learning and boosting for Naive Bayes text classifiers
    Kim, HJ
    Kim, J
    [J]. ADVANCES IN WEB-AGE INFORMATION MANAGEMENT: PROCEEDINGS, 2004, 3129 : 519 - 527
  • [9] Comparative analysis of the impact of discretization on the classification with Naive Bayes and semi-Naive Bayes classifiers
    Mizianty, Marcin
    Kurgan, Lukasz
    Ogiela, Marek
    [J]. SEVENTH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS, PROCEEDINGS, 2008, : 823 - +
  • [10] A Pairwise Naive Bayes Approach to Bayesian Classification
    Asafu-Adjei, Josephine K.
    Betensky, Rebecca A.
    [J]. INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2015, 29 (07)