Performance Evaluation of Simple K-Mean and Parallel K-Mean Clustering Algorithms: Big Data Business Process Management Concept

Cited by: 3
Authors
Zada, Islam [1]
Ali, Shaukat [1]
Khan, Inayat [2]
Hadjouni, Myriam [3]
Elmannai, Hela [4]
Zeeshan, Muhammad [5]
Serat, Ali Mohammad [6]
Jameel, Abid [7]
Affiliations
[1] Univ Peshawar, Dept Comp Sci, Peshawar, Pakistan
[2] Univ Buner, Dept Comp Sci, Buner, Pakistan
[3] Princess Nourah Bint Abdulrahman Univ, Coll Comp & Informat Sci, Dept Comp Sci, POB 84428, Riyadh 11671, Saudi Arabia
[4] Princess Nourah Bint Abdulrahman Univ, Coll Comp & Informat Sci, Dept Informat Technol, POB 84428, Riyadh 11671, Saudi Arabia
[5] Kohat Univ Sci & Technol, Inst Comp, Kohat, Pakistan
[6] Univ Nangarhar, Comp Sci Fac, Jalalabad, Nangarhar, Afghanistan
[7] Hazara Univ, Dept Comp Sci & Informat Technol, Mansehra, Pakistan
DOI
10.1155/2022/1277765
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Data is among the most valuable assets of any firm, and its volume grows rapidly over time. Extracting meaningful information from large, complex data sources is therefore a major research problem, and clustering is one of the methods used for it. Both the Simple K-Mean and the Parallel K-Mean partition clustering algorithms start from randomly chosen centroids. This work investigates the two algorithms on two datasets of 10,000 and 5,000 data items. According to the study, the results of the Simple K-Mean algorithm change from run to run, and so does the number of iterations each run needs. In some cases the outcomes differ on every run. The study identifies the properties that distinguish the Simple K-Mean algorithm from the Parallel K-Mean algorithm; distinguishing these properties helps improve cluster quality, elapsed time, and the number of iterations. The experiments are designed to show that the parallel algorithm considerably improves on the Simple K-Mean technique. The results of the parallel technique are also consistent, whereas those of the Simple K-Mean algorithm vary from run to run. Both the 10,000-item and the 5,000-item datasets are divided into ten subdatasets, one for each of ten client systems. Clusters are generated in two iterations, where the time of one iteration is the time it takes for all client systems to complete that iteration (as described in chapter 4). In the first execution, Client No. 5 has the longest elapsed time (8 ms), whereas the longest elapsed time in the following iterations is 6 ms, for a total elapsed time of 12 ms for the K-Mean clustering technique. In addition, the parallel algorithm reduces the number of executions and the time needed to complete a task.
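The partitioning idea in the abstract (random starting centroids, data split into ten subdatasets, one per client) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `partial_sums` and `kmeans_partitioned`, the one-dimensional toy data, and the convergence test are all assumptions made for illustration, and the "clients" are simulated sequentially in a single process rather than run on separate machines.

```python
import random

def partial_sums(chunk, centroids):
    """Work of one hypothetical client: per-cluster sums and counts for its chunk."""
    k = len(centroids)
    sums, counts = [0.0] * k, [0] * k
    for p in chunk:
        # Assign the point to its nearest centroid (1-D squared distance).
        j = min(range(k), key=lambda c: (p - centroids[c]) ** 2)
        sums[j] += p
        counts[j] += 1
    return sums, counts

def kmeans_partitioned(points, k, n_clients=10, seed=0, max_iter=100):
    """K-means where each iteration merges partial results from n_clients chunks."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)            # random starting centroids
    chunks = [points[i::n_clients] for i in range(n_clients)]
    for iteration in range(1, max_iter + 1):
        # In a real deployment these calls would run concurrently, one per client.
        results = [partial_sums(chunk, centroids) for chunk in chunks]
        new_centroids = []
        for j in range(k):
            total = sum(s[j] for s, _ in results)
            count = sum(c[j] for _, c in results)
            # Keep the old centroid if no point was assigned to it.
            new_centroids.append(total / count if count else centroids[j])
        if new_centroids == centroids:           # no change: converged
            return centroids, iteration
        centroids = new_centroids
    return centroids, max_iter

# Toy data (assumed): two well-separated groups around 4.5 and 104.5.
data = [float(i % 10) for i in range(50)] + [100.0 + (i % 10) for i in range(50)]
centroids, iterations = kmeans_partitioned(data, k=2, n_clients=10, seed=1)
print(sorted(centroids), iterations)
```

Changing `seed` changes the starting centroids, so the number of iterations can differ between runs, which is the run-to-run variation the abstract attributes to random initialization. With the merged partial sums, a given seed produces the same centroids for any number of clients on this toy data.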
Pages: 15