Adapting naive Bayes tree for text classification

Cited by: 0
Authors
Shasha Wang
Liangxiao Jiang
Chaoqun Li
Affiliations
[1] China University of Geosciences, Department of Computer Science
[2] China University of Geosciences, Department of Mathematics
Source
Knowledge and Information Systems | 2015 / Vol. 44
Keywords
Text classification; Multinomial naive Bayes; Multinomial naive Bayes tree; Multiclass learning
Abstract
Naive Bayes (NB) is one of the top 10 data mining algorithms thanks to its simplicity, efficiency, and interpretability. To weaken its attribute independence assumption, the naive Bayes tree (NBTree) has been proposed. NBTree is a hybrid algorithm that deploys a naive Bayes classifier on each leaf node of the built decision tree and has demonstrated remarkable classification performance. When it comes to text classification tasks, multinomial naive Bayes (MNB) has been a dominant modeling approach after the multi-variate Bernoulli model. Inspired by the success of NBTree, we propose a new algorithm called multinomial naive Bayes tree (MNBTree), which deploys a multinomial naive Bayes text classifier on each leaf node of the built decision tree. Unlike NBTree, MNBTree builds a binary tree in which the values of each split attribute are simply divided into zero and nonzero. In addition, to reduce time consumption, MNBTree uses the information gain measure instead of the classification accuracy measure to build the tree. To further improve the classification performance of MNBTree, we propose its multiclass learning version, called multiclass multinomial naive Bayes tree (MMNBTree), by applying the multiclass technique to MNBTree. Experimental results on a large number of widely used text classification benchmark datasets validate the effectiveness of the proposed algorithms, MNBTree and MMNBTree.
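The idea described in the abstract (a binary tree that splits on whether a word count is zero or nonzero, chosen by information gain, with a multinomial naive Bayes classifier at each leaf) can be illustrated with a minimal sketch. This is not the authors' implementation: the stopping parameters `max_depth`, `min_docs`, and `min_gain`, the Laplace smoothing, the toy data, and all function names are assumptions made for illustration, and the multiclass extension (MMNBTree) is not covered.

```python
import numpy as np

def entropy(y):
    """Shannon entropy (in bits) of a label vector."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def info_gain(X, y, j):
    """Information gain of splitting on word j being absent (0) or present (>0)."""
    present = X[:, j] > 0
    n, k = len(y), int(present.sum())
    if k == 0 or k == n:          # the split would leave one branch empty
        return 0.0
    return (entropy(y)
            - (k / n) * entropy(y[present])
            - ((n - k) / n) * entropy(y[~present]))

class MNB:
    """Multinomial naive Bayes with Laplace smoothing (assumed here)."""
    def fit(self, X, y, classes):
        self.classes = classes
        class_docs = np.array([np.sum(y == c) for c in classes], dtype=float)
        self.log_prior = np.log((class_docs + 1.0) / (len(y) + len(classes)))
        word_counts = np.array([X[y == c].sum(axis=0) for c in classes]) + 1.0
        self.log_lik = np.log(word_counts / word_counts.sum(axis=1, keepdims=True))
        return self

    def predict(self, x):
        return self.classes[int(np.argmax(self.log_prior + self.log_lik @ x))]

def build(X, y, classes, depth=0, max_depth=2, min_docs=4, min_gain=1e-9):
    """Return an MNB leaf, or a tuple (word index, absent subtree, present subtree)."""
    gains = [info_gain(X, y, j) for j in range(X.shape[1])]
    j = int(np.argmax(gains))
    # Hypothetical stopping rule: depth limit, too few documents, or no useful split.
    if depth >= max_depth or len(y) < min_docs or gains[j] <= min_gain:
        return MNB().fit(X, y, classes)
    present = X[:, j] > 0
    return (j,
            build(X[~present], y[~present], classes, depth + 1, max_depth, min_docs, min_gain),
            build(X[present], y[present], classes, depth + 1, max_depth, min_docs, min_gain))

def predict(node, x):
    """Route a word-count vector down the tree, then classify with the leaf's MNB."""
    while isinstance(node, tuple):
        j, absent_branch, present_branch = node
        node = present_branch if x[j] > 0 else absent_branch
    return node.predict(x)

# Toy word-count matrix (rows: documents, columns: words); made-up data.
X = np.array([[2, 0, 1, 0], [3, 0, 0, 1], [1, 0, 2, 0],
              [0, 2, 0, 1], [0, 3, 1, 0], [0, 1, 0, 2]])
y = np.array([0, 0, 0, 1, 1, 1])
classes = np.array([0, 1])

tree = build(X, y, classes, max_depth=1)
pred_a = predict(tree, np.array([1, 0, 0, 0]))
pred_b = predict(tree, np.array([0, 2, 0, 0]))
```

Because every split only tests zero versus nonzero, each candidate word needs a single pass over the documents to score, which is consistent with the abstract's motivation of reducing training time relative to an accuracy-based split criterion.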
Pages: 77-89
Number of pages: 12
    GRC: 2007 IEEE INTERNATIONAL CONFERENCE ON GRANULAR COMPUTING, PROCEEDINGS, 2007, : 708 - 711