Performance Evaluation of Simple K-Mean and Parallel K-Mean Clustering Algorithms: Big Data Business Process Management Concept

Cited by: 3

Authors:

Zada, Islam [1]

Ali, Shaukat [1]

Khan, Inayat [2]

Hadjouni, Myriam [3]

Elmannai, Hela [4]

Zeeshan, Muhammad [5]

Serat, Ali Mohammad [6]

Jameel, Abid [7]

Affiliations:

[1] Univ Peshawar, Dept Comp Sci, Peshawar, Pakistan

[2] Univ Buner, Dept Comp Sci, Buner, Pakistan

[3] Princess Nourah Bint Abdulrahman Univ, Coll Comp & Informat Sci, Dept Comp Sci, POB 84428, Riyadh 11671, Saudi Arabia

[4] Princess Nourah Bint Abdulrahman Univ, Coll Comp & Informat Sci, Dept Informat Technol, POB 84428, Riyadh 11671, Saudi Arabia

[5] Kohat Univ Sci & Technol, Inst Comp, Kohat, Pakistan

[6] Univ Nangarhar, Comp Sci Fac, Jalalabad, Nangarhar, Afghanistan

[7] Hazara Univ, Dept Comp Sci & Informat Technol, Mansehra, Pakistan

Source:

MOBILE INFORMATION SYSTEMS | 2022 / Vol. 2022

DOI:

10.1155/2022/1277765

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

Data is the most valuable asset of any firm, and it grows rapidly over time. Extracting meaningful information from a large, complex data source is a major research issue, and clustering is one method for doing so. The Simple K-Mean and Parallel K-Mean partition clustering algorithms both start from randomly chosen centroids. This work evaluates the two algorithms on two datasets of 10,000 and 5,000 items. According to the study, the results of the Simple K-Mean algorithm change from run to run, and so does the number of iterations each run needs; in some circumstances the outcomes always differ. The study identifies the properties that distinguish the Simple K-Mean algorithm from the Parallel K-Mean algorithm, and differentiating these properties helps improve cluster quality, elapsed time, and iteration count. Experiments are designed to show that the parallel algorithm considerably improves on the Simple K-Mean technique. The results of the parallel technique are also consistent, whereas those of the Simple K-Mean algorithm vary from run to run. Both the 10,000-item and the 5,000-item datasets are divided into ten subdatasets for ten client systems. Clusters are generated in two iterations, where the time of one iteration is the time it takes for all client systems to complete it (see Section 4). In the first execution, Client No. 5 has the longest elapsed time (8 ms), whereas the longest elapsed time in the following iterations is 6 ms, for a total elapsed time of 12 ms for the K-Mean clustering technique. In addition, the parallel algorithm reduces the number of executions and the time needed to complete a task.
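
The partitioned scheme described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names `assign_partial` and `kmeans_partitioned`, the synthetic two-blob data, and the seed are all assumptions, and the ten "clients" are simulated one after another in a single process rather than on separate machines.

```python
import random

def assign_partial(chunk, centroids):
    """One (hypothetical) client's work: per-cluster sums and counts for its chunk."""
    k, dim = len(centroids), len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p in chunk:
        # index of the nearest centroid by squared Euclidean distance
        j = min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
        counts[j] += 1
        for d in range(dim):
            sums[j][d] += p[d]
    return sums, counts

def kmeans_partitioned(points, centroids, n_clients=10, max_iter=100):
    """K-means in which every iteration merges partial results from n_clients chunks."""
    centroids = [tuple(c) for c in centroids]
    chunks = [points[i::n_clients] for i in range(n_clients)]  # ten subdatasets
    k, dim = len(centroids), len(centroids[0])
    for iteration in range(1, max_iter + 1):
        # In a real deployment these calls would run concurrently, one per client.
        partials = [assign_partial(ch, centroids) for ch in chunks]
        new_centroids = []
        for j in range(k):
            n = sum(cnt[j] for _, cnt in partials)
            if n == 0:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
            else:
                new_centroids.append(
                    tuple(sum(s[j][d] for s, _ in partials) / n for d in range(dim)))
        if new_centroids == centroids:  # no centroid moved: converged
            return centroids, iteration
        centroids = new_centroids
    return centroids, max_iter

# Assumed demo data: two Gaussian blobs of 500 points each.
rng = random.Random(0)
points = [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
          for cx, cy in [(0, 0), (5, 5)] for _ in range(500)]
start = rng.sample(points, 2)  # random starting centroids
final, iters = kmeans_partitioned(points, start)
```

Because each client returns only per-cluster sums and counts, the coordinator can rebuild the same centroids a single-machine run would compute, and the wall-clock time of an iteration is bounded by the slowest client, which matches the abstract's definition of iteration time.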

Pages: 15
