Clustering categorical data sets using tabu search techniques

Cited by: 62
Authors
Ng, MK [1 ]
Wong, JC [1 ]
Affiliation
[1] Univ Hong Kong, Dept Math, Hong Kong, Hong Kong, Peoples R China
Keywords
clustering; k-means; k-modes; tabu search; numeric data; categorical data
DOI
10.1016/S0031-3203(02)00021-3
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Clustering methods partition a set of objects into clusters such that objects in the same cluster are more similar to each other than objects in different clusters according to some defined criteria. The fuzzy k-means-type algorithm is best suited for implementing this clustering operation because of its effectiveness in clustering data sets. However, working only on numeric values limits its use because data sets often contain categorical values. In this paper, we present a tabu search based clustering algorithm, to extend the k-means paradigm to categorical domains, and domains with both numeric and categorical values. Using tabu search based techniques, our algorithm can explore the solution space beyond local optimality in order to aim at finding a global solution of the fuzzy clustering problem. It is found that the clustering results produced by the proposed algorithm are very high in accuracy. © 2002 Pattern Recognition Society. Published by Elsevier Science Ltd. All rights reserved.
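The idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a hard (non-fuzzy) k-modes objective with simple matching dissimilarity, and it uses a basic tabu list over single-object reassignments. All function names and parameters (`tabu_kmodes`, `tabu_size`, `neighbours`, and so on) are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter, deque


def matching_dissimilarity(a, b):
    """Count the attributes on which two categorical objects differ."""
    return sum(x != y for x, y in zip(a, b))


def cluster_modes(data, labels, k):
    """Return the per-attribute mode of each cluster (None if a cluster is empty)."""
    modes = []
    for c in range(k):
        members = [row for row, lab in zip(data, labels) if lab == c]
        if not members:
            modes.append(None)
            continue
        modes.append(tuple(Counter(col).most_common(1)[0][0] for col in zip(*members)))
    return modes


def cost(data, labels, k):
    """Total dissimilarity between every object and the mode of its cluster."""
    modes = cluster_modes(data, labels, k)
    return sum(matching_dissimilarity(row, modes[lab]) for row, lab in zip(data, labels))


def tabu_kmodes(data, k, iterations=100, tabu_size=20, neighbours=30, seed=0):
    """Assumed simplification: tabu search over hard cluster assignments."""
    rng = random.Random(seed)
    current = [rng.randrange(k) for _ in data]
    best, best_cost = current[:], cost(data, current, k)
    tabu = deque(maxlen=tabu_size)  # recently reversed moves: (object index, label)

    for _ in range(iterations):
        candidates = []
        for _ in range(neighbours):
            i = rng.randrange(len(data))
            new_label = rng.randrange(k)
            if new_label == current[i]:
                continue
            trial = current[:]
            trial[i] = new_label
            trial_cost = cost(data, trial, k)
            # A tabu move is allowed only if it beats the best cost so far (aspiration).
            if (i, new_label) in tabu and trial_cost >= best_cost:
                continue
            candidates.append((trial_cost, i, trial))
        if not candidates:
            continue
        # Take the best sampled neighbour even if it is worse than the current
        # solution; this is what lets the search leave a local optimum.
        trial_cost, i, trial = min(candidates, key=lambda t: t[0])
        tabu.append((i, current[i]))  # forbid moving object i straight back
        current = trial
        if trial_cost < best_cost:
            best, best_cost = trial[:], trial_cost
    return best, best_cost
```

Calling `tabu_kmodes(data, k)` on a list of equal-length tuples of category values returns the best assignment found and its cost. A fuzzy variant, as studied in the paper, would replace the hard labels with a membership matrix; that extension is not shown here.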
Pages: 2783-2790
Page count: 8
Related papers
50 in total
  • [1] Clustering problems using tabu search techniques
    Ng, MK
    RECENT DEVELOPMENT IN THEORIES & NUMERICS, 2003, : 364 - 373
  • [2] Clustering Categorical Data Using Community Detection Techniques
    Huu Hiep Nguyen
    COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE, 2017, 2017
  • [3] Similarity search in sets and categorical data using the signature tree
    Mamoulis, N
    Cheung, DW
    Lian, W
    19TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING, PROCEEDINGS, 2003, : 75 - 86
  • [4] A Data Labeling method for Categorical Data Clustering using Cluster Entropies in Rough Sets
    Reddy, H. Venkateswara
    Kumar, B. Suresh
    Raju, S. Viswanadha
    2014 FOURTH INTERNATIONAL CONFERENCE ON COMMUNICATION SYSTEMS AND NETWORK TECHNOLOGIES (CSNT), 2014, : 444 - 449
  • [5] Categorical Data Clustering Using Harmony Search Algorithm for Healthcare Datasets
    Sharma, Abha
    Kumar, Pushpendra
    Babulal, Kanojia Sindhuben
    Obaid, Ahmed J.
    Patel, Harshita
    INTERNATIONAL JOURNAL OF E-HEALTH AND MEDICAL COMMUNICATIONS, 2022, 13 (04)
  • [6] Improved Fuzzy Clustering Techniques for Categorical Data
    Saha, Indrajit
    Maulik, Ujjwal
    IAENG TRANSACTIONS ON ENGINEERING TECHNOLOGIES VOL 1, 2009, 1089 : 82 - +
  • [7] A parallel tabu search heuristic for clustering data set
    Ng, M
    2003 INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING WORKSHOPS, PROCEEDINGS, 2003, : 230 - 235
  • [8] Clustering Mixed Numeric and Categorical Data With Cuckoo Search
    Ji, Jinchao
    Pang, Wei
    Li, Zairong
    He, Fei
    Feng, Guozhong
    Zhao, Xiaowei
    IEEE ACCESS, 2020, 8 : 30988 - 31003
  • [9] Eigenvector Selection in Spectral Clustering using Tabu Search
    Toussi, Soheila Ashkezari
    Yazdi, Hadi Sadoghi
    Hajinezhad, Ensie
    Effati, Sohrab
    2011 1ST INTERNATIONAL ECONFERENCE ON COMPUTER AND KNOWLEDGE ENGINEERING (ICCKE), 2011, : 75 - 80
  • [10] A multicluster approach to selecting initial sets for clustering of categorical data
    Santos-Mangudo C.
    Heras A.J.
    Santos-Mangudo, Carlos (casant01@ucm.es), 2020, Informing Science Institute (15) : 227 - 246