K-means based method for overlapping document clustering

Cited by: 2
Authors
Beltran, Beatriz [1 ]
Vilarino, Darnes [1 ]
Martinez-Trinidad, Jose Fco. [2 ]
Carrasco-Ochoa, J. A. [2 ]
Pinto, David [1 ]
Affiliations
[1] Benemerita Univ Autonoma Puebla, Language & Knowledge Engn Lab, Puebla, Mexico
[2] Inst Nacl Astrofis Opt & Electr, Comp Sci, Puebla, Mexico
Keywords
Clustering; overlapping clustering; document clustering; algorithm; density
DOI
10.3233/JIFS-179878
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Overlapping clustering algorithms have been shown to be effective for clustering documents. However, current overlapping document clustering algorithms produce a large number of clusters, which makes them of little use to the user. Therefore, in this paper we propose a k-means based method for overlapping document clustering that allows the user to specify the number of groups to be built. Our experiments with different corpora show that our proposal obtains better results in terms of FBcubed than other recent overlapping document clustering methods reported in the literature.
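The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes plain k-means on cosine similarity with a user-chosen k, followed by a hypothetical overlap rule in which a document joins every cluster whose centroid similarity is within `margin` of its best one. The `fbcubed` function assumes the extended BCubed definition for overlapping clusterings (multiplicity precision and recall combined by a harmonic mean). All function names, the `margin` parameter, and the toy vectors are illustrative assumptions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def overlapping_kmeans(docs, k, margin=0.2, iters=20):
    """Hypothetical sketch: k-means with k fixed by the user, then a relaxed
    final assignment that lets a document belong to several clusters."""
    centroids = [list(d) for d in docs[:k]]  # naive init: first k documents
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for d in docs:
            best = max(range(k), key=lambda j: cosine(d, centroids[j]))
            groups[best].append(d)
        for j, g in enumerate(groups):
            if g:  # keep the previous centroid if a cluster becomes empty
                centroids[j] = [sum(col) / len(g) for col in zip(*g)]
    # Overlap rule (assumption, not from the paper): join every cluster whose
    # similarity is within `margin` of the document's best similarity.
    memberships = []
    for d in docs:
        sims = [cosine(d, c) for c in centroids]
        top = max(sims)
        memberships.append({j for j, s in enumerate(sims) if s >= top - margin})
    return memberships

def fbcubed(pred, gold):
    """Extended BCubed F-measure; pred and gold are lists of label sets,
    one set per document, in the same document order."""
    n = len(pred)

    def score(a, b):
        # Average, over documents, of the mean multiplicity score taken over
        # all documents that share at least one group with it in `a`.
        total = 0.0
        for i in range(n):
            vals = []
            for j in range(n):
                shared_a = len(a[i] & a[j])
                if shared_a:
                    vals.append(min(shared_a, len(b[i] & b[j])) / shared_a)
            total += sum(vals) / len(vals) if vals else 0.0
        return total / n

    p, r = score(pred, gold), score(gold, pred)
    return 2 * p * r / (p + r) if p + r else 0.0

# Toy term-weight vectors: two topics plus one document that mixes both.
docs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [0.7, 0.7]]
gold = [{0}, {0}, {1}, {1}, {0, 1}]
pred = overlapping_kmeans(docs, k=2, margin=0.2)
print(pred, fbcubed(pred, gold))
```

In this toy run the mixed document is the only one assigned to both clusters, and k stays under the user's control, which is the property the abstract emphasizes. A larger `margin` produces more overlap; the actual overlap criterion used in the paper may differ.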
Pages: 2127-2135
Page count: 9
Related papers
50 in total
  • [11] Towards effective document clustering: A constrained K-means based approach
    Hu, Guobiao
    Zhou, Shuigeng
    Guan, Jihong
    Hu, Xiaohua
    [J]. INFORMATION PROCESSING & MANAGEMENT, 2008, 44 (04) : 1397 - 1409
  • [12] Hypermetric k-Means Clustering for Content-Based Document Management
    Decherchi, Sergio
    Gastaldo, Paolo
    Redi, Judith
    Zunino, Rodolfo
    [J]. PROCEEDINGS OF THE INTERNATIONAL WORKSHOP ON COMPUTATIONAL INTELLIGENCE IN SECURITY FOR INFORMATION SYSTEMS CISIS 2008, 2009, 53 : 61 - 68
  • [13] A hierarchical document clustering environment based on the Induced Bisecting k-Means
    Archetti, F.
    Campanelli, P.
    Fersini, E.
    Messina, E.
    [J]. FLEXIBLE QUERY ANSWERING SYSTEMS, PROCEEDINGS, 2006, 4027 : 257 - 269
  • [14] MLK-means - A hybrid machine learning based k-means clustering algorithm for document clustering
    Perumal, Pitchandi
    Nedunchezhian, Raju
    [J]. International Journal of Computer Science Issues, 2012, 9 (5 5-2): : 164 - 173
  • [15] Efficient Sparse Spherical k-Means for Document Clustering
    Knittel, Johannes
    Koch, Steffen
    Ertl, Thomas
    [J]. PROCEEDINGS OF THE 21ST ACM SYMPOSIUM ON DOCUMENT ENGINEERING (DOCENG '21), 2021,
  • [16] Improved Document Clustering using K-means Algorithm
    Bide, Pramod
    Shedge, Rajashree
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON ELECTRICAL, COMPUTER AND COMMUNICATION TECHNOLOGIES, 2015,
  • [17] Document Clustering - A Feasible Demonstration with K-means Algorithm
    Arif, Wajiha
    Mahoto, Naeem Ahmed
    [J]. 2019 2ND INTERNATIONAL CONFERENCE ON COMPUTING, MATHEMATICS AND ENGINEERING TECHNOLOGIES (ICOMET), 2019,
  • [18] An Improved Method for K-Means Clustering
    Cui, Xiaowei
    Wang, Fuxiang
    [J]. 2015 INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND COMMUNICATION NETWORKS (CICN), 2015, : 756 - 759
  • [19] LPOCSIN With K-Means: An Overlapping Clustering Technique with Cluster Information
    Sarker, Partho Sarathi
    Showrov, Md. Imran Hossain
    [J]. 2018 3RD INTERNATIONAL CONFERENCE ON ELECTRICAL, ELECTRONICS, COMMUNICATION, COMPUTER, AND OPTIMIZATION TECHNIQUES (ICEECCOT - 2018), 2018, : 21 - 25
  • [20] An Analysis of Distributed Document Clustering Using MapReduce Based K-Means Algorithm
    Sardar T.H.
    Ansari Z.
    [J]. Ansari, Zahid (zahid_cs@pace.edu.in), 1600, Springer (101): : 641 - 650