Applying Naive Bayes Classifier to Document Clustering

Cited by: 1
Authors
Ji, Jie [1]
Zhao, Qiangfu [1]
Affiliation
[1] Univ Aizu, Syst Intelligence Lab, Ikki Machi, Aizu-Wakamatsu, Fukushima 9658580, Japan
Keywords
document clustering; Naive Bayes Classifier; Iterative Bayes Clustering; k-means; comparative advantage
DOI
10.20965/jaciii.2010.p0624
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Document clustering partitions a set of unlabeled documents so that the documents in each cluster share common concepts. A Naive Bayes Classifier (BC) is a simple probabilistic classifier based on applying Bayes' theorem with strong (naive) independence assumptions. BC requires only a small amount of training data to estimate the parameters needed for classification. Since that training data must be labeled, and labels are not available in clustering, we propose an Iterative Bayes Clustering (IBC) algorithm. To improve IBC performance, we propose combining IBC with a Comparative Advantage-based (CA) initialization method. Experimental results show that our proposal improves performance significantly over classical clustering methods.
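The abstract does not spell out the IBC procedure, so the following is only a minimal sketch of one plausible reading: start from some initial cluster assignment, train a multinomial Naive Bayes model on those assignments as if they were labels, reassign every document to its most probable cluster, and repeat until the assignment stops changing. The function names, the Laplace smoothing constant `alpha`, the toy documents, and the hand-picked initial labels are all assumptions made for illustration; the paper's CA-based initialization is not reproduced here.

```python
import math
from collections import Counter


def train_nb(docs, labels, k, vocab, alpha=1.0):
    """Fit a multinomial Naive Bayes model with Laplace smoothing (alpha is an assumption)."""
    n = len(docs)
    log_prior, log_like = [], []
    for c in range(k):
        members = [d for d, y in zip(docs, labels) if y == c]
        # Smoothed prior, so an empty cluster never yields log(0).
        log_prior.append(math.log((len(members) + alpha) / (n + alpha * k)))
        counts = Counter(w for d in members for w in d)
        total = sum(counts.values()) + alpha * len(vocab)
        log_like.append({w: math.log((counts[w] + alpha) / total) for w in vocab})
    return log_prior, log_like


def classify(doc, log_prior, log_like):
    """Return the index of the cluster with the highest posterior log-score."""
    scores = [lp + sum(ll[w] for w in doc) for lp, ll in zip(log_prior, log_like)]
    return max(range(len(scores)), key=scores.__getitem__)


def iterative_bayes_clustering(docs, init_labels, k, max_iter=50):
    """Alternate between training on the current labels and relabeling (hypothetical sketch)."""
    vocab = sorted({w for d in docs for w in d})
    labels = list(init_labels)
    for _ in range(max_iter):
        log_prior, log_like = train_nb(docs, labels, k, vocab)
        new_labels = [classify(d, log_prior, log_like) for d in docs]
        if new_labels == labels:  # assignments are stable: stop
            break
        labels = new_labels
    return labels


# Toy corpus: three sports documents followed by three finance documents.
docs = [
    ["ball", "goal", "team"],
    ["team", "goal", "match"],
    ["ball", "match", "team"],
    ["stock", "bank", "market"],
    ["market", "bank", "trade"],
    ["stock", "trade", "market"],
]
# Deliberately noisy starting assignment (documents 2 and 5 are misplaced).
labels = iterative_bayes_clustering(docs, init_labels=[0, 0, 1, 1, 1, 0], k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

On this toy input the loop corrects both misplaced documents in a single pass and then stops. A loop of this kind can only refine the assignment it starts from, which is consistent with the abstract's emphasis on the initialization method.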
Pages: 624-630
Number of pages: 7