Adapting naive Bayes tree for text classification

Cited by: 80
Authors
Wang, Shasha [1 ]
Jiang, Liangxiao [1 ]
Li, Chaoqun [2 ]
Affiliations
[1] China Univ Geosci, Dept Comp Sci, Wuhan 430074, Peoples R China
[2] China Univ Geosci, Dept Math, Wuhan 430074, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Text classification; Multinomial naive Bayes; Multinomial naive Bayes tree; Multiclass learning; CLASSIFIERS; ALGORITHMS;
DOI
10.1007/s10115-014-0746-y
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Naive Bayes (NB) is one of the top 10 algorithms thanks to its simplicity, efficiency, and interpretability. To weaken its attribute independence assumption, the naive Bayes tree (NBTree) has been proposed. NBTree is a hybrid algorithm that deploys a naive Bayes classifier on each leaf node of the built decision tree, and it has demonstrated remarkable classification performance. When it comes to text classification tasks, multinomial naive Bayes (MNB) has been a dominant modeling approach after the multi-variate Bernoulli model. Inspired by the success of NBTree, we propose a new algorithm called multinomial naive Bayes tree (MNBTree), which deploys a multinomial naive Bayes text classifier on each leaf node of the built decision tree. Unlike NBTree, MNBTree builds a binary tree in which the values of each split attribute are divided only into zero and nonzero. In addition, MNBTree uses the information gain measure instead of the classification accuracy measure to build the tree, which reduces the time cost. To further improve the classification performance of MNBTree, we propose its multiclass learning version, called multiclass multinomial naive Bayes tree (MMNBTree), by applying the multiclass technique to MNBTree. Experimental results on a large number of widely used text classification benchmark datasets validate the effectiveness of the proposed algorithms, MNBTree and MMNBTree.
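The tree construction described in the abstract (binary zero/nonzero splits chosen by information gain, with a multinomial naive Bayes classifier at each leaf) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the stopping rule (`max_depth`, `min_docs`), the Laplace smoothing constant `alpha`, the tuple-based tree representation, and the toy word-count data are all assumptions made here. The multiclass extension (MMNBTree) is not covered.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def info_gain(X, y, j):
    """Gain from splitting on whether the count of word j is zero or nonzero."""
    zero = [lab for row, lab in zip(X, y) if row[j] == 0]
    nonzero = [lab for row, lab in zip(X, y) if row[j] != 0]
    n = len(y)
    return (entropy(y)
            - len(zero) / n * entropy(zero)
            - len(nonzero) / n * entropy(nonzero))


class MultinomialNB:
    """Multinomial naive Bayes with Laplace smoothing (alpha is an assumption)."""

    def fit(self, X, y, classes, alpha=1.0):
        n_words = len(X[0])
        self.classes = classes
        self.log_prior, self.log_word = {}, {}
        for c in classes:
            rows = [row for row, lab in zip(X, y) if lab == c]
            counts = [sum(row[j] for row in rows) for j in range(n_words)]
            total = sum(counts)
            self.log_prior[c] = math.log(
                (len(rows) + alpha) / (len(y) + alpha * len(classes)))
            self.log_word[c] = [
                math.log((k + alpha) / (total + alpha * n_words)) for k in counts]
        return self

    def predict(self, row):
        def score(c):
            # log P(c) + sum over words of count * log P(word | c)
            return self.log_prior[c] + sum(
                f * lw for f, lw in zip(row, self.log_word[c]))
        return max(self.classes, key=score)


def build(X, y, classes, depth=0, max_depth=1, min_docs=4):
    """Grow a binary tree; max_depth and min_docs are hypothetical stop rules."""
    if depth < max_depth and len(y) >= min_docs:
        j = max(range(len(X[0])), key=lambda k: info_gain(X, y, k))
        if info_gain(X, y, j) > 1e-12:  # positive gain implies both sides non-empty
            zero = [(r, l) for r, l in zip(X, y) if r[j] == 0]
            nonzero = [(r, l) for r, l in zip(X, y) if r[j] != 0]
            kids = [build([r for r, _ in part], [l for _, l in part],
                          classes, depth + 1, max_depth, min_docs)
                    for part in (zero, nonzero)]
            return ("split", j, kids[0], kids[1])
    # Leaf: train MNB on the documents that reached this node.
    return ("leaf", MultinomialNB().fit(X, y, classes))


def predict(node, row):
    """Route a document by zero/nonzero tests, then classify with the leaf MNB."""
    while node[0] == "split":
        _, j, zero_child, nonzero_child = node
        node = zero_child if row[j] == 0 else nonzero_child
    return node[1].predict(row)


# Toy word-count vectors over a made-up vocabulary: [ball, game, vote, law]
X = [[2, 1, 0, 0], [1, 2, 0, 0], [3, 0, 0, 0], [1, 1, 0, 0],
     [0, 0, 2, 1], [0, 0, 1, 2], [0, 1, 3, 0], [0, 0, 1, 1]]
y = ["sports"] * 4 + ["politics"] * 4
tree = build(X, y, classes=["sports", "politics"])
label = predict(tree, [0, 0, 1, 1])
```

Splitting only on zero versus nonzero keeps every internal node binary regardless of how large a word count gets, and scoring candidate splits by information gain avoids training and evaluating a classifier for each candidate word, which is the time saving the abstract refers to.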
Pages: 77-89
Number of pages: 13
Related papers
50 in total
  • [1] Adapting naive Bayes tree for text classification
    Shasha Wang
    Liangxiao Jiang
    Chaoqun Li
    [J]. Knowledge and Information Systems, 2015, 44 : 77 - 89
  • [2] Adapting Hidden Naive Bayes for Text Classification
    Gan, Shengfeng
    Shao, Shiqi
    Chen, Long
    Yu, Liangjun
    Jiang, Liangxiao
    [J]. MATHEMATICS, 2021, 9 (19)
  • [3] An Improvement to Naive Bayes for Text Classification
    Zhang, Wei
    Gao, Feng
    [J]. CEIS 2011, 2011, 15
  • [4] Bayesian Naive Bayes classifiers to text classification
    Xu, Shuo
    [J]. JOURNAL OF INFORMATION SCIENCE, 2018, 44 (01) : 48 - 59
  • [5] Feature selection for text classification with Naive Bayes
    Chen, Jingnian
    Huang, Houkuan
    Tian, Shengfeng
    Qu, Youli
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2009, 36 (03) : 5432 - 5435
  • [6] Naive Bayes for text classification with unbalanced classes
    Frank, Eibe
    Bouckaert, Remco R.
    [J]. KNOWLEDGE DISCOVERY IN DATABASES: PKDD 2006, PROCEEDINGS, 2006, 4213 : 503 - 510
  • [7] Combining decision tree and Naive Bayes for classification
    Wang, Li-Min
    Li, Xiao-Lin
    Cao, Chun-Hong
    Yuan, Sen-Miao
    [J]. KNOWLEDGE-BASED SYSTEMS, 2006, 19 (07) : 511 - 515
  • [8] A Naive Bayes Based Tree Classification System
    Chen, Ying Yu
    Lu, Xun
    [J]. PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON EDUCATION, MANAGEMENT, COMMERCE AND SOCIETY, 2015, 17 : 699 - 702
  • [9] Adapting Naive Bayes Model for Text Classification with One-of and Imbalanced Multi-Class Problems
    Almaleh, Ahood
    Aslam, Muhammad Ahtisham
    Saeedi, Kawther
    [J]. INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2020, 20 (09) : 84 - 90
  • [10] A Technique for Improving the Performance of Naive Bayes Text Classification
    Jiang, Yuqian
    Lin, Huaizhong
    Wang, Xuesong
    Lu, Dongming
    [J]. WEB INFORMATION SYSTEMS AND MINING, PT II, 2011, 6988 : 196 - 203