Building Naive Bayes Document Classifier Using Word Clusters Based on Bootstrap Averaging

Cited by: 0
Authors
Wang Yuanzhe [1, 2]
Zhang Qiang [1, 2]
Bai Liyuan [2]
Affiliations
[1] Wuhan Univ Technol, Inst Informat Engn, Wuhan 430070, Peoples R China
[2] Henan Univ Technol, Zhengzhou 450052, Peoples R China
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
To address the low classification accuracy caused by poor distribution estimation when a naive Bayes document classifier is trained on word clusters, we build an ordered word list based on the mutual information between words and their semantic cluster labels. We then construct, through bootstrap sampling, a sample set of the same size as the word list, and use the average of the corresponding parameters estimated from the sample set as the final parameters for classifying unknown documents. Experimental results on benchmark document data sets show that the proposed strategy achieves higher classification accuracy than naive Bayes document classifiers trained on word clusters or on words.
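The abstract does not give implementation details, so the following is only a minimal sketch of the general idea of bootstrap-averaged naive Bayes parameters over word clusters, not the authors' procedure. It assumes that training documents are what gets resampled, that the word-to-cluster mapping is already given, and that Laplace smoothing is used; the mutual-information ordering of the word list is omitted. All function names, the toy data, and the parameters `n_boot` and `alpha` are hypothetical.

```python
import math
import random
from collections import Counter


def estimate_params(docs, labels, word2cluster, n_clusters, classes, alpha=1.0):
    """Laplace-smoothed multinomial naive Bayes parameters over word clusters."""
    counts = {c: [0] * n_clusters for c in classes}
    class_docs = Counter(labels)
    for tokens, y in zip(docs, labels):
        for w in tokens:
            k = word2cluster.get(w)  # words without a cluster are ignored
            if k is not None:
                counts[y][k] += 1
    n = len(docs)
    prior = {c: (class_docs[c] + alpha) / (n + alpha * len(classes)) for c in classes}
    cond = {}
    for c in classes:
        total = sum(counts[c]) + alpha * n_clusters
        cond[c] = [(counts[c][k] + alpha) / total for k in range(n_clusters)]
    return prior, cond


def bootstrap_average(docs, labels, word2cluster, n_clusters, n_boot=50, seed=0):
    """Average the parameters estimated on n_boot bootstrap resamples.

    Assumption: documents are resampled with replacement; the abstract
    does not state exactly what is resampled.
    """
    rng = random.Random(seed)
    classes = sorted(set(labels))
    n = len(docs)
    avg_prior = {c: 0.0 for c in classes}
    avg_cond = {c: [0.0] * n_clusters for c in classes}
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        prior, cond = estimate_params(
            [docs[i] for i in idx], [labels[i] for i in idx],
            word2cluster, n_clusters, classes,
        )
        for c in classes:
            avg_prior[c] += prior[c] / n_boot
            for k in range(n_clusters):
                avg_cond[c][k] += cond[c][k] / n_boot
    return avg_prior, avg_cond


def classify(tokens, prior, cond, word2cluster):
    """Return the class with the highest log posterior under the averaged parameters."""
    def score(c):
        s = math.log(prior[c])
        for w in tokens:
            k = word2cluster.get(w)
            if k is not None:
                s += math.log(cond[c][k])
        return s
    return max(prior, key=score)


# Hypothetical toy data: two clusters (0 = sports terms, 1 = finance terms).
WORD2CLUSTER = {"goal": 0, "match": 0, "team": 0, "stock": 1, "market": 1, "trade": 1}
DOCS = [["goal", "match"], ["team", "goal"], ["stock", "market"], ["market", "trade"]]
LABELS = ["sport", "sport", "finance", "finance"]

prior, cond = bootstrap_average(DOCS, LABELS, WORD2CLUSTER, n_clusters=2)
print(classify(["match", "team"], prior, cond, WORD2CLUSTER))
```

Because each bootstrap estimate is a valid probability distribution, their average is one as well, so the averaged parameters can be used directly in the usual naive Bayes decision rule.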
Pages: 202 / +
Page count: 2
Related papers
50 in total
  • [1] An automatic document classifier system based on Naive Bayes Classifier and Ontology
    Chang, Yi-Hsing
    Huang, Hsiu-Yi
    [J]. PROCEEDINGS OF 2008 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2008, : 3144 - 3149
  • [2] A NAIVE BAYES CLASSIFIER FOR WEB DOCUMENT SUMMARIES CREATED BY USING WORD SIMILARITY AND SIGNIFICANT FACTORS
    Pera, Maria Soledad
    Ng, Yiu-Kai
    [J]. INTERNATIONAL JOURNAL ON ARTIFICIAL INTELLIGENCE TOOLS, 2010, 19 (04) : 465 - 486
  • [3] Regularization and averaging of the selective Naive Bayes classifier
    Boulle, Marc
    [J]. 2006 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORK PROCEEDINGS, VOLS 1-10, 2006, : 1680 - 1688
  • [4] A METHOD FOR DETECTING DOCUMENT ORIENTATION BY USING NAIVE BAYES CLASSIFIER
    Deng, Xue
    Guo, Jun
    Chen, Youguang
    Liu, Xiaoping
    [J]. 2012 INTERNATIONAL CONFERENCE ON INDUSTRIAL CONTROL AND ELECTRONICS ENGINEERING (ICICEE), 2012, : 429 - 432
  • [5] Malayalam Word Sense Disambiguation using Naive Bayes Classifier
    Gopal, Sreelakshmi
    Haroon, Rosna P.
    [J]. 2016 INTERNATIONAL CONFERENCE ON ADVANCES IN HUMAN MACHINE INTERACTION (HMI), 2016, : 83 - 86
  • [6] A Distributed Chinese Naive Bayes Classifier Based on Word Embedding
    Feng, Mengke
    Wu, Guoshi
    [J]. PROCEEDINGS OF THE 2016 4TH INTERNATIONAL CONFERENCE ON MACHINERY, MATERIALS AND COMPUTING TECHNOLOGY, 2016, 60 : 1121 - 1127
  • [7] Applying Naive Bayes Classifier to Document Clustering
    Ji, Jie
    Zhao, Qiangfu
    [J]. JOURNAL OF ADVANCED COMPUTATIONAL INTELLIGENCE AND INTELLIGENT INFORMATICS, 2010, 14 (06) : 624 - 630
  • [8] A Word-Based Naive Bayes Classifier for Confidence Estimation in Speech Recognition
    Sanchis, Alberto
    Juan, Alfons
    Vidal, Enrique
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2012, 20 (02) : 565 - 574
  • [9] NAIVE BAYES CLASSIFIER FOR WORD SENSE DISAMBIGUATION OF PUNJABI LANGUAGE
    Singh, Varinder Pal
    Kumar, Parteek
    [J]. MALAYSIAN JOURNAL OF COMPUTER SCIENCE, 2018, 31 (03) : 188 - 199
  • [10] Opinion Based Book Recommendation Using Naive Bayes Classifier
    Tewari, Anand Shanker
    Ansari, Tasif Sultan
    Barman, Asim Gopal
    [J]. 2014 INTERNATIONAL CONFERENCE ON CONTEMPORARY COMPUTING AND INFORMATICS (IC3I), 2014, : 139 - 144