Novel dynamic k-modes clustering of categorical and non categorical dataset with optimized genetic algorithm based feature selection

Cited by: 1
Authors
Suryanarayana, G. [1]
Prakash, L. N. C. K. [2]
Mahesh, P. C. Senthil [3]
Bhaskar, T. [4]
Affiliations
[1] Vardhaman Coll Engn, Dept CSE, Hyderabad, Telangana, India
[2] CVR Coll Engn, Dept CSE, Hyderabad, Telangana, India
[3] Excel Engn Coll, Dept CSE, Namakkal, Tamil Nadu, India
[4] CMR Coll Engn & Technol, Dept CSE, Hyderabad, Telangana, India
Keywords
Clustering; Genetic algorithm; K-modes clustering; Encircle; PSO
DOI
10.1007/s11042-022-12126-5
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Clustering partitions a given dataset into homogeneous groups according to the provided features, with the aim of finding structure in unlabelled data. Cluster analysis is an unsupervised learning technique that identifies interesting patterns in data objects without class labels. The k-modes clustering algorithm is effective for categorical data because it is easy to implement and can handle massive amounts of data. However, because it selects its initial centroids at random, it tends to converge to a local optimum. The main contribution of this paper is an evaluation of clustering performance on various datasets with the proposed system. The proposed method uses a genetic-based metaheuristic encircle algorithm to select enriched features, and a novel dynamic k-modes clustering based on dimensionality-reduced PSO for the clustering process with better computational time. The encircling-prey concept is incorporated to choose the fitness function and to overcome the limitations of the genetic algorithm in feature selection. The paper integrates the k-modes algorithm with particle swarm optimization (PSO) to obtain a global optimum solution and to update the initial centroids. Several of the datasets used in the evaluation achieved low accuracy in previous work. A comparative analysis against state-of-the-art methods shows that the proposed approach performs better in terms of performance metrics such as F1 score, accuracy, and NMI.
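As a rough illustration of the baseline the abstract refers to, the following is a minimal sketch of plain k-modes with simple matching (Hamming) dissimilarity and randomly chosen initial modes. It is not the paper's dynamic, PSO-based method; the function names `hamming` and `k_modes`, their parameters, and the cost definition are assumptions made for illustration only.

```python
import random
from collections import Counter


def hamming(a, b):
    """Simple matching dissimilarity: count of attributes that differ."""
    return sum(x != y for x, y in zip(a, b))


def k_modes(rows, k, max_iter=100, seed=0):
    """Plain k-modes on a list of equal-length tuples of categorical values.

    Returns (labels, modes, cost), where cost is the total dissimilarity
    between each row and the mode of its assigned cluster.
    """
    rng = random.Random(seed)
    # Random initial modes: the step the abstract identifies as the
    # source of local-optimum solutions.
    modes = rng.sample(rows, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each row goes to its nearest mode.
        new_labels = [
            min(range(k), key=lambda j: hamming(row, modes[j])) for row in rows
        ]
        if new_labels == labels:
            break  # assignments are stable, so the modes are too
        labels = new_labels
        # Update step: per attribute, take the most frequent category.
        for j in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == j]
            if members:  # an empty cluster keeps its previous mode
                modes[j] = tuple(
                    Counter(column).most_common(1)[0][0]
                    for column in zip(*members)
                )
    cost = sum(hamming(row, modes[lab]) for row, lab in zip(rows, labels))
    return labels, modes, cost
```

Because the result depends on the seed, a common workaround is to run `k_modes` with several seeds and keep the run with the lowest cost; according to the abstract, the paper instead uses particle swarm optimization to update the initial centroids.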
Pages: 24399 - 24418
Number of pages: 20
Related papers
50 in total
  • [21] A rough set based algorithm for updating the modes in categorical clustering
    Semeh Ben Salem
    Sami Naouali
    Zied Chtourou
    [J]. International Journal of Machine Learning and Cybernetics, 2021, 12 : 2069 - 2090
  • [22] A Novel Consensus Fuzzy K-Modes Clustering Using Coupling DNA-Chain-Hypergraph P System for Categorical Data
    Jiang, Zhenni
    Liu, Xiyu
    [J]. PROCESSES, 2020, 8 (10) : 1 - 17
  • [23] A Genetic Algorithm Based Ensemble Approach for Categorical Data Clustering
    Goswami, Jyoti Prokash
    Mahanta, Anjana Kakoti
    [J]. 2015 ANNUAL IEEE INDIA CONFERENCE (INDICON), 2015,
  • [24] K-Modes clustering algorithm based on a new distance measure
    Liang, Jiye
    Bai, Liang
    Cao, Fuyuan
    [J]. Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2010, 47 (10) : 1749 - 1755
  • [25] A MD fuzzy k-modes Algorithm for Clustering Categorical Matrix-Object Data
    Li S.
    Zhang M.
    Cao F.
    [J]. Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2019, 56 (06) : 1325 - 1337
  • [26] FKMAWCW: Categorical fuzzy k-modes clustering with automated attribute-weight and cluster-weight learning
    Oskouei, Amin Golzari
    Balafar, Mohammad Ali
    Motamed, Cina
    [J]. CHAOS SOLITONS & FRACTALS, 2021, 153
  • [27] k-mw-modes: An algorithm for clustering categorical matrix-object data
    Cao, Fuyuan
    Yu, Liqin
    Huang, Joshua Zhexue
    Liang, Jiye
    [J]. APPLIED SOFT COMPUTING, 2017, 57 : 605 - 614
  • [28] Genetic Algorithm and Simulated Annealing based Approaches to Categorical Data Clustering
    Saha, Indrajit
    Mukhopadhyay, Anirban
    [J]. IEEE REGION 10 COLLOQUIUM AND THIRD INTERNATIONAL CONFERENCE ON INDUSTRIAL AND INFORMATION SYSTEMS, VOLS 1 AND 2, 2008, : 18 - +
  • [29] Multiobjective Genetic Algorithm-Based Fuzzy Clustering of Categorical Attributes
    Mukhopadhyay, Anirban
    Maulik, Ujjwal
    Bandyopadhyay, Sanghamitra
    [J]. IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, 2009, 13 (05) : 991 - 1005
  • [30] Genetic algorithm and simulated annealing based approaches to categorical data clustering
    Saha, Indrajit
    Mukhopadhyay, Anirban
    [J]. IMECS 2008: INTERNATIONAL MULTICONFERENCE OF ENGINEERS AND COMPUTER SCIENTISTS, VOLS I AND II, 2008, : 534 - +