Particle swarm optimizer for variable weighting in clustering high-dimensional data

Cited by: 58
Authors
Lu, Yanping [1, 2]
Wang, Shengrui [1]
Li, Shaozi [2]
Zhou, Changle [2]
Affiliations
[1] Univ Sherbrooke, Dept Comp Sci, Sherbrooke, PQ J1K 2R1, Canada
[2] Xiamen Univ, Dept Cognit Sci, Xiamen 361005, Peoples R China
Keywords
High-dimensional data; Projected clustering; Variable weighting; Particle swarm optimization; Text clustering; ALGORITHM;
DOI
10.1007/s10994-009-5154-2
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We present a particle swarm optimizer (PSO) for the variable weighting problem in projected clustering of high-dimensional data. Many subspace clustering algorithms fail to yield good cluster quality because they do not employ an efficient search strategy. We focus on soft projected clustering. We design a suitable k-means objective weighting function in which a change of variable weights is exponentially reflected. We also transform the original constrained variable weighting problem into a problem with bound constraints, using a normalized representation of variable weights, and we use a particle swarm optimizer to minimize the objective function in order to search for global optima of the variable weighting problem in clustering. Our experimental results on both synthetic and real data show that the proposed algorithm greatly improves cluster quality. In addition, the results of the new algorithm are much less dependent on the initial cluster centroids. In an application to text clustering, we show that the algorithm can be easily adapted to other similarity measures, such as the extended Jaccard coefficient for text data, and can be very effective.
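The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not state: the objective is taken to be the within-cluster squared distance with each variable weight normalized per cluster and raised to an exponent `beta`; the optimizer is a plain global-best PSO with positions clamped to the bounds [0, 1]; and the centroids are held fixed. All function names, parameter values, and the toy data are hypothetical.

```python
import random


def normalize(weights):
    """Scale one cluster's raw weights to sum to 1 (uniform if all are 0)."""
    total = sum(weights)
    if total == 0.0:
        return [1.0 / len(weights)] * len(weights)
    return [w / total for w in weights]


def objective(position, data, centroids, beta):
    """Assumed weighted k-means objective; position holds k*d raw weights."""
    k, d = len(centroids), len(centroids[0])
    scaled = []
    for c in range(k):
        norm = normalize(position[c * d:(c + 1) * d])
        scaled.append([w ** beta for w in norm])  # exponent amplifies weight changes
    total = 0.0
    for x in data:
        # Each point contributes its smallest weighted distance to any centroid.
        total += min(
            sum(scaled[c][j] * (x[j] - centroids[c][j]) ** 2 for j in range(d))
            for c in range(k)
        )
    return total


def pso_minimize(f, dim, n_particles=30, n_iters=200, seed=0,
                 inertia=0.72, c1=1.49, c2=1.49):
    """Plain global-best PSO over the box [0, 1]^dim (bound constraints only)."""
    rng = random.Random(seed)
    pos = [[rng.random() for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for j in range(dim):
                vel[i][j] = (inertia * vel[i][j]
                             + c1 * rng.random() * (pbest[i][j] - pos[i][j])
                             + c2 * rng.random() * (gbest[j] - pos[i][j]))
                # Clamp to the bounds instead of enforcing a sum-to-one constraint.
                pos[i][j] = min(1.0, max(0.0, pos[i][j] + vel[i][j]))
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


# Hypothetical toy data: dimension 0 separates the two clusters, dimension 1 is noise.
data = [(0.0, 0.1), (0.1, 0.9), (0.05, 0.5), (1.0, 0.1), (0.9, 0.9), (0.95, 0.5)]
centroids = [(0.05, 0.5), (0.95, 0.5)]
best_pos, best_val = pso_minimize(
    lambda p: objective(p, data, centroids, beta=2.0), dim=4)
weights = [normalize(best_pos[c * 2:(c + 1) * 2]) for c in range(2)]
```

In this sketch the search space is just a box, because normalization happens inside the objective; that is one way to read the "normalized representation" mentioned in the abstract. On the toy data the search should place most of each cluster's weight on dimension 0, the variable along which the clusters are compact.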
Pages: 43-70
Number of pages: 28
Related papers
  • [1] Particle swarm optimizer for variable weighting in clustering high-dimensional data
    Yanping Lu
    Shengrui Wang
    Shaozi Li
    Changle Zhou
    [J]. Machine Learning, 2011, 82 : 43 - 70
  • [2] Particle Swarm Optimizer for Variable Weighting in Clustering High-dimensional Data
    Lu, Yanping
    Wang, Shengrui
    Li, Shaozi
    Zhou, Changle
    [J]. 2009 IEEE SWARM INTELLIGENCE SYMPOSIUM, 2009, : 37 - +
  • [3] Particle Swarm Optimisation for Feature Selection and Weighting in High-Dimensional Clustering
    O'Neill, Damien
    Lensen, Andrew
    Xue, Bing
    Zhang, Mengjie
    [J]. 2018 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC), 2018, : 173 - 180
  • [4] Finding relevant clustering directions in high-dimensional data using Particle Swarm Optimization
    Marini, Federico
    Walczak, Beata
    [J]. JOURNAL OF CHEMOMETRICS, 2011, 25 (07) : 366 - 374
  • [5] A review on particle swarm optimization algorithm and its variants to clustering high-dimensional data
    Esmin, Ahmed A. A.
    Coelho, Rodrigo A.
    Matwin, Stan
    [J]. ARTIFICIAL INTELLIGENCE REVIEW, 2015, 44 (01) : 23 - 45
  • [6] A review on particle swarm optimization algorithm and its variants to clustering high-dimensional data
    Ahmed A. A. Esmin
    Rodrigo A. Coelho
    Stan Matwin
    [J]. Artificial Intelligence Review, 2015, 44 : 23 - 45
  • [7] Fuzzy Clustering High-Dimensional Data Using Information Weighting
    Bodyanskiy, Yevgeniy V.
    Tyshchenko, Oleksii K.
    Mashtalir, Sergii V.
    [J]. ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING, PT I, 2019, 11508 : 385 - 395
  • [8] Bayesian variable selection in clustering high-dimensional data
    Tadesse, MG
    Sha, N
    Vannucci, M
    [J]. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2005, 100 (470) : 602 - 617
  • [9] A feature group weighting method for subspace clustering of high-dimensional data
    Chen, Xiaojun
    Ye, Yunming
    Xu, Xiaofei
    Huang, Joshua Zhexue
    [J]. PATTERN RECOGNITION, 2012, 45 (01) : 434 - 446
  • [10] An entropy weighting mixture model for subspace clustering of high-dimensional data
    Peng, Liuqing
    Zhang, Junying
    [J]. PATTERN RECOGNITION LETTERS, 2011, 32 (08) : 1154 - 1161