Multinomial naive Bayes for text categorization revisited

Authors
Kibriya, AM [1]
Frank, E [1]
Pfahringer, B [1]
Holmes, G [1]
Affiliation
[1] Univ Waikato, Dept Comp Sci, Hamilton, New Zealand
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
This paper presents empirical results for several versions of the multinomial naive Bayes classifier on four text categorization problems, and a way of improving it using locally weighted learning. More specifically, it compares standard multinomial naive Bayes to the recently proposed transformed weight-normalized complement naive Bayes classifier (TWCNB) [1], and shows that some of the modifications included in TWCNB may not be necessary to achieve optimum performance on some datasets. However, it does show that TFIDF conversion and document length normalization are important. It also shows that support vector machines can, in fact, sometimes very significantly outperform both methods. Finally, it shows how the performance of multinomial naive Bayes can be improved using locally weighted learning. However, the overall conclusion of our paper is that support vector machines are still the method of choice if the aim is to maximize accuracy.
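The abstract singles out TFIDF conversion and document length normalization as the transformations that matter for multinomial naive Bayes. The following is a minimal sketch of that pipeline, not the authors' implementation. The specific formulas (term frequency `log(1 + f)`, inverse document frequency `log(N / df)`, L2 length normalization, Laplace smoothing with `alpha = 1`), the function names, and the toy documents are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def fit_idf(docs):
    """Compute idf(t) = log(N / df(t)) from tokenized training documents."""
    n_docs = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log(n_docs / count) for term, count in df.items()}


def transform(doc, idf):
    """Map a token list to an L2-normalized TF-IDF vector (dict term -> weight)."""
    counts = Counter(t for t in doc if t in idf)  # drop terms unseen in training
    vec = {t: math.log(1 + f) * idf[t] for t, f in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm > 0 else {}


def train_mnb(vectors, labels, vocab, alpha=1.0):
    """Estimate log priors and Laplace-smoothed log term probabilities per class."""
    class_docs = Counter(labels)
    term_mass = {c: defaultdict(float) for c in class_docs}
    for vec, label in zip(vectors, labels):
        for t, w in vec.items():
            term_mass[label][t] += w  # fractional "counts" after the transforms
    model = {}
    for c, n_c in class_docs.items():
        total = sum(term_mass[c].values()) + alpha * len(vocab)
        log_prior = math.log(n_c / len(labels))
        log_prob = {t: math.log((term_mass[c][t] + alpha) / total) for t in vocab}
        model[c] = (log_prior, log_prob)
    return model


def predict(doc, idf, model):
    """Return the class maximizing log prior + sum of weight * log term probability."""
    vec = transform(doc, idf)

    def score(c):
        log_prior, log_prob = model[c]
        return log_prior + sum(w * log_prob[t] for t, w in vec.items())

    return max(model, key=score)


# Hypothetical toy corpus, purely illustrative.
train_docs = [
    ["cheap", "pills", "buy", "now"],
    ["buy", "cheap", "watches"],
    ["meeting", "agenda", "attached"],
    ["project", "meeting", "tomorrow"],
]
train_labels = ["spam", "spam", "ham", "ham"]

idf = fit_idf(train_docs)
vectors = [transform(d, idf) for d in train_docs]
model = train_mnb(vectors, train_labels, vocab=list(idf))

print(predict(["cheap", "watches", "now"], idf, model))
print(predict(["agenda", "project"], idf, model))
```

Because every document vector has unit length, a long document contributes no more total weight to its class than a short one, which is the effect length normalization is meant to achieve.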
Pages: 488-499
Page count: 12