Novel dynamic k-modes clustering of categorical and non categorical dataset with optimized genetic algorithm based feature selection

Cited by: 1
Authors
Suryanarayana, G. [1]
Prakash, L. N. C. K. [2]
Mahesh, P. C. Senthil [3]
Bhaskar, T. [4]
Affiliations
[1] Vardhaman Coll Engn, Dept CSE, Hyderabad, Telangana, India
[2] CVR Coll Engn, Dept CSE, Hyderabad, Telangana, India
[3] Excel Engn Coll, Dept CSE, Namakkal, Tamil Nadu, India
[4] CMR Coll Engn & Technol, Dept CSE, Hyderabad, Telangana, India
Keywords
Clustering; Genetic algorithm; K-modes clustering; Encircle; PSO
DOI
10.1007/s11042-022-12126-5
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Clustering partitions a dataset into homogeneous groups according to its features, with the aim of uncovering structure in unlabelled data. Cluster analysis is an unsupervised learning technique that finds interesting patterns in data objects without class labels. The k-modes clustering algorithm is effective for categorical data because it is easy to implement and can handle large volumes of data. However, because its initial centroids are selected at random, it tends to converge to a local optimum. The main contribution of this paper is to evaluate the clustering performance of the proposed system on various datasets. The proposed method uses a genetic-based metaheuristic encircle algorithm to select enriched features, and a novel dynamic k-modes clustering based on dimensionality-reduced particle swarm optimization (PSO) to perform the clustering with better computational time. The encircling-prey concept is incorporated to choose the fitness function and to overcome the limitations of the genetic algorithm in feature selection. The k-modes algorithm is integrated with PSO to update the initial centroids and obtain a global optimum solution. Several of the datasets used for evaluation were found to yield low accuracy in previous work. A comparative analysis against state-of-the-art methods, in terms of performance metrics such as F1 score, accuracy, and normalized mutual information (NMI), shows that the proposed approach is more effective.
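For orientation, the baseline that the abstract builds on can be sketched as follows. This is a minimal illustration of plain k-modes with simple matching (Hamming) dissimilarity and random initial modes, not the paper's dynamic k-modes, its genetic feature selection, or its PSO step. The function names (`hamming`, `column_modes`, `k_modes`, `best_of_restarts`) and the toy rows are hypothetical, and the random-restart helper is a simple stand-in used only to show why initialization matters; the paper uses PSO for that purpose instead.

```python
import random
from collections import Counter


def hamming(a, b):
    """Simple matching dissimilarity: number of attributes that differ."""
    return sum(x != y for x, y in zip(a, b))


def column_modes(rows):
    """Most frequent category of each attribute across the given rows."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))


def k_modes(data, k, max_iter=100, seed=0):
    """Plain k-modes on a list of equal-length tuples of categories."""
    rng = random.Random(seed)
    # Random initial modes drawn from the distinct rows. This is the step
    # the abstract identifies as the cause of local optima.
    modes = rng.sample(sorted(set(data)), k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each row joins its nearest mode.
        new_labels = [
            min(range(k), key=lambda j: hamming(row, modes[j])) for row in data
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: recompute each mode attribute by attribute.
        for j in range(k):
            members = [row for row, lab in zip(data, labels) if lab == j]
            if members:  # an empty cluster keeps its previous mode
                modes[j] = column_modes(members)
    cost = sum(hamming(row, modes[lab]) for row, lab in zip(data, labels))
    return labels, modes, cost


def best_of_restarts(data, k, n_restarts=10):
    """Illustrative stand-in (not the paper's PSO): keep the cheapest run."""
    runs = (k_modes(data, k, seed=s) for s in range(n_restarts))
    return min(runs, key=lambda run: run[2])


# Hypothetical toy data: three categorical attributes per row.
toy_rows = [
    ("red", "small", "round"),
    ("red", "small", "oval"),
    ("red", "medium", "round"),
    ("blue", "large", "square"),
    ("blue", "large", "flat"),
    ("blue", "huge", "square"),
]

if __name__ == "__main__":
    labels, modes, cost = best_of_restarts(toy_rows, k=2, n_restarts=5)
    print(labels, modes, cost)
```

Running `k_modes` with different `seed` values can give different final costs on harder data, which is the sensitivity to initialization that motivates a global search over the initial centroids.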
Pages: 24399-24418
Number of pages: 20
Related papers
50 in total
  • [1] Novel dynamic k-modes clustering of categorical and non categorical dataset with optimized genetic algorithm based feature selection
    G. Suryanarayana
    LNC Prakash K
    P. C. Senthil Mahesh
    T. Bhaskar
    [J]. Multimedia Tools and Applications, 2022, 81 : 24399 - 24418
  • [2] A genetic k-modes algorithm for clustering categorical data
    Gan, GJ
    Yang, ZJ
    Wu, JH
    [J]. ADVANCED DATA MINING AND APPLICATIONS, PROCEEDINGS, 2005, 3584 : 195 - 202
  • [3] A genetic fuzzy k-Modes algorithm for clustering categorical data
    Gan, G.
    Wu, J.
    Yang, Z.
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2009, 36 (02) : 1615 - 1620
  • [4] A fuzzy k-modes algorithm for clustering categorical data
    Huang, ZX
    Ng, MK
    [J]. IEEE TRANSACTIONS ON FUZZY SYSTEMS, 1999, 7 (04) : 446 - 452
  • [5] A Global K-modes Algorithm for Clustering Categorical Data
    Bai Tian
    Kulikowski, C. A.
    Gong Leiguang
    Yang Bin
    Huang Lan
    Zhou Chunguang
    [J]. CHINESE JOURNAL OF ELECTRONICS, 2012, 21 (03) : 460 - 465
  • [6] An efficient k-modes algorithm for clustering categorical datasets
    Dorman, Karin S.
    Maitra, Ranjan
    [J]. STATISTICAL ANALYSIS AND DATA MINING, 2022, 15 (01) : 83 - 97
  • [7] A weighting k-modes algorithm for subspace clustering of categorical data
    Cao, Fuyuan
    Liang, Jiye
    Li, Deyu
    Zhao, Xingwang
    [J]. NEUROCOMPUTING, 2013, 108 : 23 - 30
  • [8] Categorical fuzzy k-modes clustering with automated feature weight learning
    Saha, Arkajyoti
    Das, Swagatam
    [J]. NEUROCOMPUTING, 2015, 166 : 422 - 435
  • [9] Initialization of K-Modes Clustering for Categorical Data
    Li Tao-ying
    Chen Yan
    Jin Zhi-hong
    Li Ye
    [J]. 2013 INTERNATIONAL CONFERENCE ON MANAGEMENT SCIENCE AND ENGINEERING (ICMSE), 2013 : 107 - 112
  • [10] Clustering categorical data: Soft rounding k-modes
    Gavva, Surya Teja
    Karthik, C. S.
    Punna, Sharath
    [J]. INFORMATION AND COMPUTATION, 2024, 296