A k-means type clustering algorithm for subspace clustering of mixed numeric and categorical datasets

Cited by: 43
Authors
Ahmad, Amir [1]
Dey, Lipika [2]
Affiliations
[1] King Abdulaziz Univ, Fac Comp & Informat Technol, Rabigh, Saudi Arabia
[2] Tata Consultancy Serv, Innovat Labs, New Delhi, India
Keywords
Clustering; Subspace clustering; Mixed data; Categorical data
DOI
10.1016/j.patrec.2011.02.017
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Almost all subspace clustering algorithms proposed so far are designed for numeric datasets. In this paper, we present a k-means type clustering algorithm that finds clusters in data subspaces of mixed numeric and categorical datasets. The method computes each attribute's contribution to each cluster, and we propose a new cost function for a k-means type algorithm. One advantage of the algorithm is that its complexity is linear in the number of data points. The algorithm is also useful for describing cluster formation in terms of attribute contributions to different clusters. It is tested on various synthetic and real datasets to show its effectiveness. The clustering results are explained using the attribute weights in the clusters and are also compared with published results. © 2011 Elsevier B.V. All rights reserved.
Pages: 1062-1069
Page count: 8
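
Illustrative sketch (not the authors' implementation): the abstract describes a k-means type cost function in which attributes contribute differently to different clusters, with cost linear in the number of data points. The Python code below shows one assignment pass of that general kind under stated assumptions. The per-cluster weight arrays `W_num` and `W_cat`, the function names, and the use of simple matching (0 if equal, 1 otherwise) for categorical attributes are all hypothetical choices made for illustration; the paper's actual cost function and categorical distance are not given in this record.

```python
import numpy as np


def weighted_mixed_distance(x_num, x_cat, center_num, center_cat, w_num, w_cat):
    """Weighted dissimilarity between one point and one cluster center.

    Numeric attributes use squared difference; categorical attributes use
    simple matching (an assumption, not the paper's distance). The weights
    belong to the cluster, so each cluster can emphasize its own attributes.
    """
    numeric_part = np.sum(w_num * (x_num - center_num) ** 2)
    categorical_part = np.sum(w_cat * (x_cat != center_cat))
    return float(numeric_part + categorical_part)


def assign_points(X_num, X_cat, centers_num, centers_cat, W_num, W_cat):
    """Assign each point to its closest center and return (labels, total cost).

    One pass over the n points; each point is compared with k centers over
    all attributes, so the work grows linearly with n.
    """
    n, k = X_num.shape[0], centers_num.shape[0]
    labels = np.empty(n, dtype=int)
    cost = 0.0
    for i in range(n):
        dists = [
            weighted_mixed_distance(
                X_num[i], X_cat[i],
                centers_num[j], centers_cat[j],
                W_num[j], W_cat[j],
            )
            for j in range(k)
        ]
        labels[i] = int(np.argmin(dists))
        cost += dists[labels[i]]
    return labels, cost


# Toy data: one numeric and one categorical attribute, two clusters.
X_num = np.array([[0.0], [0.1], [5.0], [5.2]])
X_cat = np.array([["a"], ["a"], ["b"], ["b"]])
centers_num = np.array([[0.05], [5.1]])
centers_cat = np.array([["a"], ["b"]])
W_num = np.ones((2, 1))  # hypothetical per-cluster numeric weights
W_cat = np.ones((2, 1))  # hypothetical per-cluster categorical weights

labels, cost = assign_points(X_num, X_cat, centers_num, centers_cat, W_num, W_cat)
print(labels.tolist(), round(cost, 6))
```

In a full k-means type loop, this assignment step would alternate with updates of the centers and of the weights until the cost stops decreasing; how the weights are derived is the part specific to the paper and is not reproduced here.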