An ensemble agglomerative hierarchical clustering algorithm based on clusters clustering technique and the novel similarity measurement

Cited by: 57
Authors
Li, Teng [1 ]
Rezaeipanah, Amin [2 ]
El Din, ElSayed M. Tag [3 ]
Affiliations
[1] Chongqing Coll Elect Engn, Artificial Intelligence & Big Data Coll, Chongqing 401331, Peoples R China
[2] Persian Gulf Univ, Dept Comp Engn, Bushehr, Iran
[3] Future Univ Egypt, Fac Engn & Technol, Elect Engn Dept, New Cairo 11845, Egypt
Keywords
Hierarchical clustering; Meta-clusters; Ensemble clustering; Model selection; Similarity measurement; Clusters clustering; WEIGHTED ENSEMBLE; DENSITY;
DOI
10.1016/j.jksuci.2022.04.010
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
The advent of architectures such as the Internet of Things (IoT) has led to the dramatic growth of data and the production of big data. Managing this often-unlabeled data is a big challenge for the real world. Hierarchical Clustering (HC) is recognized as an efficient unsupervised approach to unlabeled data analysis. In data mining, HC is a mechanism for grouping data at different scales by creating a dendrogram. One of the most common HC methods is Agglomerative Hierarchical Clustering (AHC) in which clusters are created bottom-up. In addition, ensemble clustering approaches are used today in complex problems due to the weakness of individual clustering methods. Accordingly, we propose a clustering framework using AHC methods based on ensemble approaches, which includes the clusters clustering technique and a novel similarity measurement. The proposed algorithm is a Meta-Clustering Ensemble scheme based on Model Selection (MCEMS). MCEMS uses the bi-weighting policy to solve the model selection associated problem to improve ensemble clustering. Specifically, multiple AHC individual methods cluster the data from different aspects to form the primary clusters. According to the results of different methods, the similarity between the instances is calculated using a novel similarity measurement. The MCEMS scheme involves the creation of meta-clusters by re-clustering of primary clusters. After clusters clustering, the number of optimal clusters is determined by merging similar clusters and considering a threshold. Finally, the similarity of the instances to the meta-clusters is calculated and each instance is assigned to the meta-cluster with the highest similarity to form the final clusters. Simulations have been performed on some datasets from the UCI repository to evaluate MCEMS scheme compared to state-of-the-art algorithms. 
Extensive experiments clearly prove the superiority of MCEMS over HMM, DSPA and WHAC algorithms based on Wilcoxon test and Cophenetic correlation coefficient. © 2022 The Author(s). Published by Elsevier B.V. on behalf of King Saud University.
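The general pattern the abstract describes (several AHC methods produce primary clusters, pairwise similarity is derived from their agreement, and a second clustering pass yields the final groups) can be illustrated with a generic co-association consensus. The sketch below is not the MCEMS scheme: it omits the bi-weighting policy, the novel similarity measurement, and the threshold-based merging of meta-clusters, none of which are specified in this record. The function name `coassociation_ensemble`, the choice of linkage methods, and the synthetic data are assumptions made for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform


def coassociation_ensemble(X, n_clusters,
                           methods=("single", "complete", "average", "ward")):
    """Generic co-association consensus over several AHC linkages.

    Hypothetical helper for illustration; not the authors' MCEMS algorithm.
    """
    n = X.shape[0]
    coassoc = np.zeros((n, n))
    for method in methods:
        # Each base AHC method yields one flat partition (the primary clusters).
        labels = fcluster(linkage(X, method=method),
                          t=n_clusters, criterion="maxclust")
        # Count how often each pair of instances shares a primary cluster.
        coassoc += (labels[:, None] == labels[None, :])
    coassoc /= len(methods)

    # Convert agreement into a distance and re-cluster for the consensus.
    dist = 1.0 - coassoc
    np.fill_diagonal(dist, 0.0)
    consensus = linkage(squareform(dist, checks=False), method="average")
    final = fcluster(consensus, t=n_clusters, criterion="maxclust")
    return final, coassoc


# Two well-separated synthetic blobs (assumed data, not from the paper).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (20, 2)),
               rng.normal(5.0, 0.3, (20, 2))])
labels, coassoc = coassociation_ensemble(X, n_clusters=2)
```

Here the co-association matrix plays the role of an instance-to-instance similarity, and the second `linkage` call stands in for the re-clustering step; a faithful implementation would replace both with the definitions given in the full paper.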
Pages: 3828-3842
Page count: 15
Related papers
50 in total
  • [1] An agglomerative hierarchical clustering algorithm based on global distance measurement
    Liu, Fang
    Wei, Yongqing
    Ren, Min
    Hou, Xiuyan
    Liu, Yingying
    2015 7TH INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY IN MEDICINE AND EDUCATION (ITME), 2015, : 363 - 367
  • [2] Towards Efficient Ensemble Hierarchical Clustering with MapReduce-based Clusters Clustering Technique and the Innovative Similarity Criterion
    Tian, Ping
    Shen, Huitao
    Abolfathi, Ahad
    JOURNAL OF GRID COMPUTING, 2022, 20 (04)
  • [3] Towards Efficient Ensemble Hierarchical Clustering with MapReduce-based Clusters Clustering Technique and the Innovative Similarity Criterion
    Ping Tian
    Huitao Shen
    Ahad Abolfathi
    Journal of Grid Computing, 2022, 20
  • [4] A Similarity Based Agglomerative Clustering Algorithm in Networks
    Liu, Zhiyuan
    Wang, Xiujuan
    Ma, Yinghong
    NINTH INTERNATIONAL CONFERENCE ON GRAPHIC AND IMAGE PROCESSING (ICGIP 2017), 2018, 10615
  • [5] An Ensemble Clustering Framework Based on Hierarchical Clustering Ensemble Selection and Clusters Clustering
    Li, Wenjun
    Wang, Zikang
    Sun, Wei
    Bahrami, Sara
    CYBERNETICS AND SYSTEMS, 2023, 54 (05) : 741 - 766
  • [6] An Agglomerative Hierarchical Clustering Framework for Improving the Ensemble Clustering Process
    Jafarzadegan, Mohammad
    Safi-Esfahani, Faramarz
    Beheshti, Zahra
    CYBERNETICS AND SYSTEMS, 2022, 53 (08) : 679 - 701
  • [7] An Agglomerative Clustering Technique Based on a Global Similarity Metric
    Stanoev, Angel
    Trpevski, Igor
    Kocarev, Ljupco
    ICT INNOVATIONS 2010, 2011, 83 : 266 - 275
  • [8] An Automated Clustering Algorithm Based On Agglomerative Clustering
    Karabina, Armagan
    Kilic, Erdal
    2016 24TH SIGNAL PROCESSING AND COMMUNICATION APPLICATION CONFERENCE (SIU), 2016, : 1801 - 1804
  • [9] Defining Hydrogeological Site Similarity with Hierarchical Agglomerative Clustering
    Kawa, Nura
    Cucchi, Karina
    Rubin, Yoram
    Attinger, Sabine
    Hesse, Falk
    GROUNDWATER, 2023, 61 (04) : 563 - 573
  • [10] Agglomerative Hierarchical Clustering of Emotions in Speech Based on Subjective Relative Similarity
    Takashima, Ryoichi
    Nagano, Tohru
    Tachibana, Ryuki
    Nishimura, Masafumi
    12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 2484 - 2487