Applying Naive Bayes Classifier to Document Clustering

Cited by: 1
Authors
Ji, Jie [1]
Zhao, Qiangfu [1]
Affiliations
[1] Univ Aizu, Syst Intelligence Lab, Ikki Machi, Aizu wakamatsu, Fukushima 9658580, Japan
Keywords
document clustering; Naive Bayes Classifier; Iterative Bayes Clustering; k-means; comparative advantage;
DOI
10.20965/jaciii.2010.p0624
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Document clustering partitions a set of unlabeled documents so that the documents in each cluster share common concepts. A Naive Bayes Classifier (BC) is a simple probabilistic classifier that applies Bayes' theorem under strong (naive) independence assumptions. BC needs only a small amount of training data to estimate the parameters required for classification. Because that training data must be labeled, whereas clustering starts from unlabeled documents, we propose an Iterative Bayes Clustering (IBC) algorithm. To improve IBC performance, we propose combining IBC with a Comparative Advantage-based (CA) initialization method. Experimental results show that our proposal improves performance significantly over classical clustering methods.
Pages: 624-630
Page count: 7
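
Illustrative sketch
The abstract describes the idea only at a high level, so the following is a minimal sketch of one way a Naive Bayes classifier can be iterated into a clustering procedure, not the authors' implementation. It assumes a multinomial bag-of-words model with Laplace smoothing, and it takes the starting labels as an argument: `init_labels` stands in for any initialization (random, k-means, or the paper's CA method, which is not reproduced here). All function and variable names are hypothetical.

```python
import math
from collections import Counter


def train_nb(docs, labels, k, vocab):
    """Estimate multinomial Naive Bayes parameters (assumed model) from the current labels."""
    n = len(docs)
    log_prior, log_lik = [], []
    for c in range(k):
        members = [d for d, y in zip(docs, labels) if y == c]
        # Smoothed prior so that an empty cluster never produces log(0).
        log_prior.append(math.log((len(members) + 1) / (n + k)))
        counts = Counter(w for d in members for w in d)
        total = sum(counts.values()) + len(vocab)
        # Laplace (add-one) smoothing over the whole vocabulary.
        log_lik.append({w: math.log((counts[w] + 1) / total) for w in vocab})
    return log_prior, log_lik


def classify(doc, log_prior, log_lik):
    """Return the cluster with the highest posterior log-score for one document."""
    scores = [lp + sum(ll[w] for w in doc) for lp, ll in zip(log_prior, log_lik)]
    return max(range(len(scores)), key=scores.__getitem__)


def iterative_bayes_clustering(docs, init_labels, k, max_iter=50):
    """Alternate between fitting the classifier and relabeling until labels stop changing."""
    vocab = sorted({w for d in docs for w in d})
    labels = list(init_labels)
    for _ in range(max_iter):
        log_prior, log_lik = train_nb(docs, labels, k, vocab)
        new_labels = [classify(d, log_prior, log_lik) for d in docs]
        if new_labels == labels:  # converged: no document changed cluster
            break
        labels = new_labels
    return labels


# Toy corpus (made up for illustration): three sports and three finance documents.
docs = [
    ["ball", "goal", "team"],
    ["team", "goal", "match"],
    ["ball", "match", "team"],
    ["stock", "market", "bank"],
    ["bank", "market", "price"],
    ["stock", "price", "bank"],
]
# Deliberately imperfect starting labels: the last document begins in the wrong cluster.
result = iterative_bayes_clustering(docs, [0, 0, 0, 1, 1, 0], k=2)
print(result)
```

Because each round retrains on the labels produced by the previous round, the final partition depends on the starting labels, which is consistent with the abstract's emphasis on an initialization method.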