Weighted consensus clustering and its application to Big data

被引:17
|
作者
Alguliyev, Rasim M. [1 ]
Aliguliyev, Ramiz M. [1 ]
Sukhostat, Lyudmila, V [1 ]
机构
[1] Azerbaijan Natl Acad Sci, Inst Informat Technol, 9A B Vahabzade St, AZ-1141 Baku, Azerbaijan
关键词
Weighted consensus clustering; Big data; Utility function; Purity-based utility function; Co-association matrix; ENSEMBLE; ALGORITHM; INDEXES;
D O I
10.1016/j.eswa.2020.113294
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
The aim of this study is the development of a weighted consensus clustering that assigns weights to single clustering methods using the purity utility function. In the case of Big data that does not contain labels, the utility function based on the Davies-Bouldin index is proposed in this paper. The Banknote authentication, Phishing, Diabetic, Magic04, Credit card clients, Covertype, Phone accelerometer, and NSL-KDD datasets are used to assess the efficiency of the proposed consensus approach. The proposed approach is evaluated using the Euclidean, Minkowski, squared Euclidean, cosine, and Chebychev distance metrics. It is compared with single clustering algorithms (DBSCAN, OPTICS, CLARANS, k-means, and shared nearby neighbor clustering). The experimental results show the effectiveness of the proposed approach to the Big data clustering in comparison to single clustering methods. The proposed weighted consensus clustering using the squared Euclidean distance metric achieves the highest accuracy, which is a very promising result for Big data clustering. It can be applied to expert systems to help experts make group decisions based on several alternatives. The paper also provides directions for future research on consensus clustering in this area. (C) 2020 Elsevier Ltd. All rights reserved.
引用
收藏
页数:15
相关论文
共 50 条
  • [31] I-HASTREAM: Density-Based Hierarchical Clustering of Big Data Streams and Its Application to Big Graph Analytics Tools
    Hassani, Marwan
    Cuzzocrea, Alfredo
    Spaus, Pascal
    Seidl, Thomas
    2016 16TH IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND GRID COMPUTING (CCGRID), 2016, : 656 - 665
  • [32] Improved quantum clustering analysis based on the weighted distance and its application
    Fan Decheng
    Jon, Song
    Pang, Cholho
    Dong, Wang
    Won, CholJin
    HELIYON, 2018, 4 (11):
  • [33] Big Data Clustering: A Review
    Shirkhorshidi, Ali Seyed
    Aghabozorgi, Saeed
    Teh, Ying Wah
    Herawan, Tutut
    COMPUTATIONAL SCIENCE AND ITS APPLICATIONS - ICCSA 2014, PT V, 2014, 8583 : 707 - 720
  • [34] MapReduce Clustering for Big Data
    Ghattas, Badih
    Pinto, Antoine
    Diao, Sambou
    2021 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2021, : 5116 - 5124
  • [35] Big Data and Clustering Algorithms
    Ajin, V. W.
    Kumar, Lekshmy D.
    2016 INTERNATIONAL CONFERENCE ON RESEARCH ADVANCES IN INTEGRATED NAVIGATION SYSTEMS (RAINS), 2016,
  • [36] Strategies for Big Data Clustering
    Kurasova, Olga
    Marcinkevicius, Virginijus
    Medvedev, Viktor
    Rapecka, Aurimas
    Stefanovic, Pavel
    2014 IEEE 26TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI), 2014, : 740 - 747
  • [37] Big Data clustering validity
    Tlili, Monia
    Hamdani, Tarek M.
    2014 6TH INTERNATIONAL CONFERENCE OF SOFT COMPUTING AND PATTERN RECOGNITION (SOCPAR), 2014, : 348 - 352
  • [38] Overlapping community detection using weighted consensus clustering
    LINTAO YANG
    ZETAI YU
    JING QIAN
    SHOUYIN LIU
    Pramana, 2016, 87
  • [39] Overlapping community detection using weighted consensus clustering
    Yang, Lintao
    Yu, Zetai
    Qian, Jing
    Liu, Shouyin
    PRAMANA-JOURNAL OF PHYSICS, 2016, 87 (04):
  • [40] Sorensen-Dice Similarity Indexing based Weighted Iterative Clustering for Big Data Analytics
    Annathurai, KalyanaSaravanan
    Angamuthu, Tamilarasi
    INTERNATIONAL ARAB JOURNAL OF INFORMATION TECHNOLOGY, 2022, 19 (01) : 11 - 22