A text clustering algorithm based on find of density peaks

Cited by: 3
Authors
Liu, Peiyu [1]
Liu, Yingying [2]
Hou, Xiuyan [2]
Li, Qingqing [2]
Zhu, Zhenfang [3]
Affiliations
[1] Shandong Yingcai Univ, Jinan, Peoples R China
[2] Shandong Normal Univ, Sch Informat Sci & Engn, Jinan, Peoples R China
[3] Shandong Jiaotong Univ, Sch Informat Sci & Elect Engn, Jinan, Peoples R China
Keywords
Density; Text clustering; Feature term; Vector distance; Similarity;
DOI
10.1109/ITME.2015.103
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Text clustering is one of the core issues in text mining and information retrieval. Clustering algorithms fall into four categories: partitional, hierarchical, density-based, and intelligent clustering algorithms. At present, many of them cannot meet the speed and self-adaptation demands of text clustering. This paper therefore proposes a text clustering algorithm based on finding density peaks. The algorithm computes text distance and density from the similarity of text vectors. SVM is used to represent each text and obtain the vector mapping for the similarity calculation. Next, for each text, the algorithm finds its local density and its distance from points of higher density, removes the noise points, and selects the cluster centers. Each remaining point is assigned to the cluster represented by its nearest cluster center. Across several sets of comparative experiments, the density-based text clustering shows advantages in reliability and robustness.
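The steps summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the cosine distance, the cutoff `d_c`, the `min_rho` noise threshold, the `rho * delta` score for picking centers, and the toy term-count vectors are all assumptions chosen for illustration.

```python
import math

def cosine_distance(u, v):
    """Return 1 - cosine similarity of two term-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return max(0.0, 1.0 - dot / norm) if norm else 1.0

def density_peaks_cluster(vectors, d_c, n_clusters, min_rho=1):
    """Cluster text vectors by density peaks (hypothetical helper).

    d_c, n_clusters and min_rho are illustrative parameters, not values
    taken from the paper.
    """
    n = len(vectors)
    dist = [[0.0 if i == j else cosine_distance(vectors[i], vectors[j])
             for j in range(n)] for i in range(n)]
    # Local density: number of other texts closer than the cutoff d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]
    # Visit texts from densest to sparsest; density ties are broken by index.
    order = sorted(range(n), key=lambda i: -rho[i])
    delta = [0.0] * n
    for rank, i in enumerate(order):
        if rank == 0:
            # Densest text has no denser neighbor: use its largest distance.
            delta[i] = max(dist[i])
        else:
            # Distance to the nearest text that ranks higher in density.
            delta[i] = min(dist[i][j] for j in order[:rank])
    # Noise: texts with fewer than min_rho neighbors inside the cutoff.
    noise = {i for i in range(n) if rho[i] < min_rho}
    # Centers: non-noise texts with the largest rho * delta score.
    candidates = sorted((i for i in range(n) if i not in noise),
                        key=lambda i: -rho[i] * delta[i])
    centers = candidates[:n_clusters]
    labels = []
    for i in range(n):
        if i in noise:
            labels.append(-1)  # noise points stay unassigned
        else:
            # Assign to the cluster of the nearest center.
            labels.append(min(range(len(centers)),
                              key=lambda c: dist[i][centers[c]]))
    return labels, centers

# Toy term-count vectors (made up): two topics plus one outlier document.
docs = [
    [3, 1, 0, 0, 0], [2, 1, 0, 0, 0], [3, 2, 0, 0, 0],
    [0, 0, 1, 3, 0], [0, 0, 2, 3, 0], [0, 0, 1, 2, 0],
    [0, 0, 0, 0, 5],
]
labels, centers = density_peaks_cluster(docs, d_c=0.2, n_clusters=2)
```

In this sketch the outlier document has no neighbor within `d_c`, so it is treated as noise and receives the label `-1`, while the other six documents are split by topic. On real text, the vectors would come from a weighting scheme such as TF-IDF, and `d_c` would need tuning to the distance distribution.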
Pages: 348-352
Page count: 5