Automatic clustering based on Crow Search Algorithm-Kmeans (CSA-Kmeans) and Data Envelopment Analysis (DEA)

Cited by: 9
Authors
Balavand, Alireza [1]
Kashan, Ali Husseinzadeh [2]
Saghaei, Abbas [1]
Affiliations
[1] Islamic Azad Univ, Sci & Res Branch, Dept Ind Engn, Tehran, Iran
[2] Tarbiat Modares Univ, Fac Ind & Syst Engn, Tehran, Iran
Keywords
Automatic Clustering; K-Means; Crow Search Optimization Algorithm; Cluster Validity Indices; Data Envelopment Analysis; OPTIMIZATION ALGORITHM; VALIDITY MEASURE; EVOLUTION; COLONY;
DOI
10.2991/ijcis.11.1.98
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Cluster Validity Indices (CVIs) evaluate the efficiency of a clustering algorithm, and Data Envelopment Analysis (DEA) evaluates the efficiency of Decision-Making Units (DMUs) using a number of input and output data. Combining CVIs and DEA inspired the development of a new automatic clustering algorithm called Automatic Clustering Based on Data Envelopment Analysis (ACDEA). ACDEA determines the optimal number of clusters in four main steps. In the first step, a new clustering algorithm called CSA-Kmeans is introduced. In this algorithm, clustering is performed by the Crow Search Algorithm (CSA), with the K-means algorithm generating the initial cluster centers. In the second step, the data are clustered with CSA-Kmeans for each number of clusters from k(min) to k(max). At each iteration of clustering, using correct data labels, the Within-Group Scatter (WGS) index, Between-Group Scatter (BGS) index, Dunn Index (DI), Calinski-Harabasz (CH) index, and Silhouette Index (SI) are extracted and stored. These indices form a matrix whose columns hold the values of the validity indices and whose rows, the DMUs, correspond to the clustering runs from k(min) to k(max) clusters. In the third step, the efficiency of the DMUs is calculated with the DEA method from the matrix built in the second step. Because DI, CH, and SI estimate the relationship between within-group scatter and between-group scatter, WGS and BGS are used as the input variables of the DEA, and DI, CH, and SI are used as its output variables. Finally, in the fourth step, the AP method is used to calculate the efficiency of the DMUs, so that an efficiency value is obtained for each DMU, and the DMU with the maximum efficiency indicates the optimal number of clusters. In this study, three categories of data are used to measure the efficiency of the ACDEA algorithm, and the efficiency of ACDEA is compared with that of the DCPSO, GCUK, and ACDE algorithms. According to the results, there is a significant positive relationship between the input CVIs and the output CVIs in data envelopment analysis, and the optimal number of clusters is achieved in many cases.
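The second step of the abstract (one row of validity indices per candidate number of clusters) can be illustrated with a minimal sketch. It rests on several assumptions not confirmed by the record: plain Lloyd's K-means stands in for CSA-Kmeans, WGS and BGS are taken as the usual sums of squared distances, DI, CH, and SI follow their textbook definitions, and all function names (`kmeans_labels`, `validity_row`, `build_dmu_matrix`) are hypothetical. The DEA and AP steps are not reproduced.

```python
import math
import random


def centroid(pts):
    """Component-wise mean of a list of equal-length tuples."""
    return tuple(sum(xs) / len(pts) for xs in zip(*pts))


def kmeans_labels(points, k, rng, iters=50):
    """Plain Lloyd's K-means, used here only as a stand-in for CSA-Kmeans."""
    centers = rng.sample(points, k)
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = centroid(members)
    return labels


def silhouette(groups):
    """Mean silhouette width; singleton clusters score 0 by convention."""
    scores = []
    for i, g in enumerate(groups):
        for p in g:
            if len(g) == 1:
                scores.append(0.0)
                continue
            a = sum(math.dist(p, q) for q in g) / (len(g) - 1)
            b = min(sum(math.dist(p, q) for q in h) / len(h)
                    for j, h in enumerate(groups) if j != i)
            scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


def validity_row(points, labels):
    """Return [WGS, BGS, DI, CH, SI] for one partition (assumed definitions)."""
    by_label = {}
    for p, lab in zip(points, labels):
        by_label.setdefault(lab, []).append(p)
    groups = list(by_label.values())
    k, n = len(groups), len(points)
    overall = centroid(points)
    # Within- and between-group scatter as sums of squared distances
    wgs = sum(math.dist(p, centroid(g)) ** 2 for g in groups for p in g)
    bgs = sum(len(g) * math.dist(centroid(g), overall) ** 2 for g in groups)
    # Dunn index: smallest between-cluster distance over largest diameter
    min_sep = min(math.dist(p, q) for i, g in enumerate(groups)
                  for h in groups[i + 1:] for p in g for q in h)
    max_diam = max(math.dist(p, q) for g in groups for p in g for q in g)
    ch = (bgs / (k - 1)) / (wgs / (n - k))
    return [wgs, bgs, min_sep / max_diam, ch, silhouette(groups)]


def build_dmu_matrix(points, k_min, k_max, seed=0):
    """One row (DMU) per k: inputs WGS, BGS; outputs DI, CH, SI."""
    rng = random.Random(seed)
    return {k: validity_row(points, kmeans_labels(points, k, rng))
            for k in range(k_min, k_max + 1)}


# Toy data: three well-separated groups of four points each
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 0), (10, 1), (11, 0), (11, 1),
          (5, 9), (5, 10), (6, 9), (6, 10)]
matrix = build_dmu_matrix(points, k_min=2, k_max=4)
for k, row in matrix.items():
    print(k, [round(v, 3) for v in row])
```

In a full pipeline, each row would then be passed to a DEA solver as one DMU, with the first two columns as inputs and the last three as outputs. The sketch requires k(min) of at least 2, since CH and DI are undefined for a single cluster.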
Pages: 1322 - 1337
Number of pages: 16
Related papers
14 in total
  • [1] Automatic clustering based on Crow Search Algorithm-Kmeans (CSA-Kmeans) and Data Envelopment Analysis (DEA)
    Alireza Balavand
    Ali Husseinzadeh Kashan
    Abbas Saghaei
    [J]. International Journal of Computational Intelligence Systems, 2018, 11 : 1322 - 1337
  • [2] The water supply association analysis method in Shenzhen based on kmeans clustering discretization and apriori algorithm
    Liu, Xin
    Sang, Xuefeng
    Chang, Jiaxuan
    Zheng, Yang
    Han, Yuping
    [J]. PLOS ONE, 2021, 16 (08):
  • [3] Data clustering using K-Means based on Crow Search Algorithm
    K Lakshmi
    N Karthikeyani Visalakshi
    S Shanthi
    [J]. Sādhanā, 2018, 43
  • [4] Data clustering using K-Means based on Crow Search Algorithm
    Lakshmi, K.
    Visalakshi, N. Karthikeyani
    Shanthi, S.
    [J]. SADHANA-ACADEMY PROCEEDINGS IN ENGINEERING SCIENCES, 2018, 43 (11):
  • [5] Automatic Data Clustering based on Hybrid Atom Search Optimization and Sine-Cosine Algorithm
    Abd Elaziz, Mohamed
    Neggaz, Nabil
    Ewees, Ahmed A.
    Lu, Songfeng
    [J]. 2019 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC), 2019, : 2315 - 2322
  • [6] Empirical analysis of the impact of financial development on the income gap between urban and rural residents in the context of large data using fuzzy Kmeans clustering algorithm (Publication with Expression of Concern)
    Li, Yao
    [J]. INTERNATIONAL JOURNAL OF ELECTRICAL ENGINEERING EDUCATION, 2020,
  • [7] A Density-Center-Based Automatic Clustering Algorithm for IoT Data Analysis
    Zhang, Tao
    Zhou, MengChu
    Guo, Xiwang
    Qi, Liang
    Abusorrah, Abdullah
    [J]. IEEE INTERNET OF THINGS JOURNAL, 2022, 9 (24) : 24682 - 24694
  • [8] Automatic clustering and feature selection using gravitational search algorithm and its application to microarray data analysis
    Vijay Kumar
    Dinesh Kumar
    [J]. Neural Computing and Applications, 2019, 31 : 3647 - 3663
  • [9] Automatic clustering and feature selection using gravitational search algorithm and its application to microarray data analysis
    Kumar, Vijay
    Kumar, Dinesh
    [J]. NEURAL COMPUTING & APPLICATIONS, 2019, 31 (08): : 3647 - 3663
  • [10] School-Based Management Performance Efficiency Modeling and Profiling using Data Envelopment Analysis and K-Means Clustering Algorithm
    Tibay, Jona P.
    Ambat, Shaneth C.
    Lagman, Ace C.
    [J]. 2019 IEEE 4TH INTERNATIONAL CONFERENCE ON COMPUTER AND COMMUNICATION SYSTEMS (ICCCS 2019), 2019, : 149 - 153