Performance Evaluation of Simple K-Mean and Parallel K-Mean Clustering Algorithms: Big Data Business Process Management Concept

Cited by: 3
Authors
Zada, Islam [1]
Ali, Shaukat [1]
Khan, Inayat [2]
Hadjouni, Myriam [3]
Elmannai, Hela [4]
Zeeshan, Muhammad [5]
Serat, Ali Mohammad [6]
Jameel, Abid [7]
Affiliations
[1] Univ Peshawar, Dept Comp Sci, Peshawar, Pakistan
[2] Univ Buner, Dept Comp Sci, Buner, Pakistan
[3] Princess Nourah Bint Abdulrahman Univ, Coll Comp & Informat Sci, Dept Comp Sci, POB 84428, Riyadh 11671, Saudi Arabia
[4] Princess Nourah Bint Abdulrahman Univ, Coll Comp & Informat Sci, Dept Informat Technol, POB 84428, Riyadh 11671, Saudi Arabia
[5] Kohat Univ Sci & Technol, Inst Comp, Kohat, Pakistan
[6] Univ Nangarhar, Comp Sci Fac, Jalalabad, Nangarhar, Afghanistan
[7] Hazara Univ, Dept Comp Sci & Informat Technol, Mansehra, Pakistan
DOI
10.1155/2022/1277765
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Data is among the most valuable assets of any firm, and its volume grows rapidly over time. Extracting meaningful information from large, complex data sources is therefore a major research problem, and clustering is one of the methods used for this purpose. Both the Simple K-Mean and the Parallel K-Mean partition clustering algorithms start from randomly chosen initial centroids. This work investigates the two algorithms on two datasets of 10,000 and 5,000 data items, respectively. According to the study, the results of the Simple K-Mean algorithm change from run to run, and so does the number of iterations each run needs. The study separates and identifies the properties that distinguish the Simple K-Mean algorithm from the Parallel K-Mean algorithm; differentiating these features helps improve cluster quality, elapsed time, and iteration count. The experiments are designed to show that the parallel algorithm considerably improves on the Simple K-Mean technique. The results of the parallel technique are also consistent, whereas those of the Simple K-Mean algorithm vary from run to run. Each of the two datasets is divided into ten sub-datasets, one for each of ten client systems. Clusters are generated in two iterations, where the time of one iteration is the time it takes for all client systems to complete it (described in Section 4). In the first execution, Client No. 5 has the longest elapsed time (8 ms), whereas the longest elapsed time in the following iterations is 6 ms, for a reported total elapsed time of 12 ms for the K-Mean clustering technique. In addition, the parallel algorithm reduces the number of executions and the time needed to complete a task.
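The partitioning scheme the abstract describes can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`nearest`, `partial_sums`, `kmeans`), the `n_clients` and `seed` parameters, and the strided way the data is split are all assumptions made for illustration. Each chunk stands in for one client system and is processed sequentially here; in a real deployment the chunks would run concurrently, so the time of one iteration would be set by the slowest client.

```python
import random

def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda j: sum((p - c) ** 2 for p, c in zip(point, centroids[j])))

def partial_sums(chunk, centroids):
    """Work of one (hypothetical) client: per-cluster coordinate sums and counts."""
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for point in chunk:
        j = nearest(point, centroids)
        counts[j] += 1
        for d in range(dim):
            sums[j][d] += point[d]
    return sums, counts

def kmeans(data, k, n_clients=1, seed=None, max_iter=100):
    """Lloyd-style K-means. With n_clients > 1 the data is split into chunks
    and the partial results are merged once per iteration.
    Returns (centroids, iterations_used)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(data, k)]  # random starting centroids
    chunks = [data[i::n_clients] for i in range(n_clients)]  # assumed split
    dim = len(data[0])
    for iteration in range(1, max_iter + 1):
        total = [[0.0] * dim for _ in range(k)]
        count = [0] * k
        for chunk in chunks:  # sequential stand-in for parallel clients
            sums, counts = partial_sums(chunk, centroids)
            for j in range(k):
                count[j] += counts[j]
                for d in range(dim):
                    total[j][d] += sums[j][d]
        # An empty cluster keeps its previous centroid.
        new = [[t / count[j] for t in total[j]] if count[j] else centroids[j]
               for j in range(k)]
        if new == centroids:  # no centroid moved: converged
            return new, iteration
        centroids = new
    return centroids, max_iter
```

With the same `seed`, `kmeans(data, k, n_clients=1)` and `kmeans(data, k, n_clients=10)` start from the same centroids and should agree up to floating-point rounding, since merging per-chunk sums and counts gives the same means as a single pass. Changing `seed` changes the starting centroids and may change both the result and the number of iterations, which is the run-to-run variation the abstract attributes to random initialization.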
Pages: 15