Multinomial naive Bayes for text categorization revisited

Cited by: 0
Authors
Kibriya, AM [1 ]
Frank, E [1 ]
Pfahringer, B [1 ]
Holmes, G [1 ]
Affiliation
[1] Univ Waikato, Dept Comp Sci, Hamilton, New Zealand
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents empirical results for several versions of the multinomial naive Bayes classifier on four text categorization problems, and a way of improving it using locally weighted learning. More specifically, it compares standard multinomial naive Bayes to the recently proposed transformed weight-normalized complement naive Bayes classifier (TWCNB) [1], and shows that some of the modifications included in TWCNB may not be necessary to achieve optimum performance on some datasets. However, it does show that TFIDF conversion and document length normalization are important. It also shows that support vector machines can, in fact, sometimes very significantly outperform both methods. Finally, it shows how the performance of multinomial naive Bayes can be improved using locally weighted learning. However, the overall conclusion of our paper is that support vector machines are still the method of choice if the aim is to maximize accuracy.
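As a rough illustration of the two steps the abstract singles out as important (TFIDF conversion and document length normalization feeding a multinomial naive Bayes classifier), the following is a minimal sketch in plain Python. It is not the authors' implementation: the weighting `log(1 + tf) * log(N / df)`, the L2 normalization, the Laplace smoothing constant `alpha`, the function names, and the toy documents are all assumptions chosen for illustration, and the paper's exact transforms may differ.

```python
import math
from collections import Counter, defaultdict

def fit_idf(docs):
    """Inverse document frequency per term, from tokenized training docs."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    return {t: math.log(n / c) for t, c in df.items()}

def transform(doc, idf):
    """TFIDF weights followed by L2 length normalization (assumed variant)."""
    tf = Counter(t for t in doc if t in idf)  # unseen terms are dropped
    w = {t: math.log(1 + c) * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(v * v for v in w.values())) or 1.0
    return {t: v / norm for t, v in w.items()}

def train_mnb(vectors, labels, alpha=1.0):
    """Multinomial naive Bayes on fractional (transformed) term weights."""
    vocab = sorted({t for v in vectors for t in v})
    class_docs = Counter(labels)
    sums = defaultdict(lambda: defaultdict(float))
    for v, y in zip(vectors, labels):
        for t, w in v.items():
            sums[y][t] += w
    model = {}
    for y, n_y in class_docs.items():
        total = sum(sums[y].values()) + alpha * len(vocab)
        model[y] = {
            "log_prior": math.log(n_y / len(labels)),
            "log_like": {t: math.log((sums[y].get(t, 0.0) + alpha) / total)
                         for t in vocab},
        }
    return model

def predict(model, vector):
    """Return the class with the highest log posterior score."""
    def score(y):
        m = model[y]
        return m["log_prior"] + sum(w * m["log_like"][t]
                                    for t, w in vector.items())
    return max(model, key=score)

# Toy data, invented purely for this example.
train_docs = [
    "cheap pills buy now".split(),
    "buy cheap watches now".split(),
    "meeting agenda for monday".split(),
    "project meeting notes attached".split(),
]
train_labels = ["spam", "spam", "ham", "ham"]

idf = fit_idf(train_docs)
model = train_mnb([transform(d, idf) for d in train_docs], train_labels)
pred_spam = predict(model, transform("buy cheap pills".split(), idf))
pred_ham = predict(model, transform("monday project meeting".split(), idf))
```

The locally weighted learning step mentioned in the abstract would, under the same assumptions, train a model like this on a neighborhood of each test document rather than on the full training set.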
Pages: 488-499
Page count: 12