Performance Enhancement of Distributed K-Means Clustering for Big Data Analytics Through In-memory Computation

Cited by: 0

Authors:

Ketu, Shwet [1]

Agarwal, Sonali [1]

Affiliations:

[1] Indian Inst Informat Technol, Allahabad, Uttar Pradesh, India

Source:

2015 EIGHTH INTERNATIONAL CONFERENCE ON CONTEMPORARY COMPUTING (IC3) | 2015

Keywords:

Big data; Big data analytic; Distributed K-Mean; Hadoop MapReduce; Apache Spark; On-disk computation; In-memory computation;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods];

Discipline classification code:

081202 ;

Abstract:

Big data analytics has recently emerged as a prominent research area in information technology, serving various data-driven domains that need to process big data effectively. It faces challenges such as inefficient storage, processing delays, low rates of information retrieval, and complex algorithms, none of which can be handled and managed with traditional methods. New programming frameworks are required to help software developers deal with these challenges. This paper considers two such frameworks, Hadoop MapReduce and Apache Spark, which support on-disk and in-memory computation respectively. Clustering is one of the important tasks in big data mining and is used for information retrieval and knowledge discovery. In this work, we analyze the performance of distributed K-Means clustering under the in-memory and on-disk computational models. To enhance the performance of distributed K-Means clustering, both computational models were adopted and an experimental analysis was performed.
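
For readers unfamiliar with the technique named in the abstract, the following is a minimal single-process sketch of how a K-Means iteration is commonly split into a map step and a reduce step. It is not the authors' implementation and uses neither Hadoop nor Spark; the function names (`map_phase`, `reduce_phase`, `kmeans`) and the list-of-partitions layout are assumptions made for illustration only.

```python
import math


def nearest(point, centroids):
    """Return the index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def map_phase(partition, centroids):
    """Map step (hypothetical helper): for one data partition, accumulate
    per-cluster coordinate sums and point counts."""
    partial = {}
    for p in partition:
        i = nearest(p, centroids)
        sums, n = partial.get(i, ([0.0] * len(p), 0))
        partial[i] = ([s + x for s, x in zip(sums, p)], n + 1)
    return partial


def reduce_phase(partials, centroids):
    """Reduce step (hypothetical helper): merge the partial sums from all
    partitions and divide by the counts to get the new centroids."""
    total = {}
    for partial in partials:
        for i, (sums, n) in partial.items():
            acc, m = total.get(i, ([0.0] * len(sums), 0))
            total[i] = ([a + s for a, s in zip(acc, sums)], m + n)
    # A cluster that received no points keeps its previous centroid.
    return [
        tuple(s / total[i][1] for s in total[i][0]) if i in total else centroids[i]
        for i in range(len(centroids))
    ]


def kmeans(partitions, centroids, iterations=10):
    """Run a fixed number of iterations. The partitions stay in memory
    between iterations, which loosely mirrors the in-memory model."""
    for _ in range(iterations):
        partials = [map_phase(part, centroids) for part in partitions]
        centroids = reduce_phase(partials, centroids)
    return centroids
```

In this sketch the input partitions are held in memory and reused on every iteration. An on-disk model would, roughly speaking, re-read the input and write intermediate results to storage on each pass, which is the kind of overhead the abstract's comparison is concerned with.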

Pages: 318-324

Page count: 7

Related records (50 in total)

[21] A Novel K-Means based Clustering Algorithm for Big Data
Sinha, Ankita
Jana, Prasanta K.
2016 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATIONS AND INFORMATICS (ICACCI), 2016: 1875 - 1879
[22] Optimized big data K-means clustering using MapReduce
Cui, Xiaoli
Zhu, Pingfei
Yang, Xin
Li, Keqiu
Ji, Changqing
JOURNAL OF SUPERCOMPUTING, 2014, 70 (03): 1249 - 1259
[23] Improvement of K-Means Algorithm for Accelerated Big Data Clustering
Wu, Chunqiong
Yan, Bingwen
Yu, Rongrui
Huang, Zhangshu
Yu, Baoqin
Yu, Yanliang
Chen, Na
Zhou, Xiukao
INTERNATIONAL JOURNAL OF INFORMATION TECHNOLOGIES AND SYSTEMS APPROACH, 2021, 14 (02) : 99 - 119
[24] Improved k-Means Clustering Algorithm for Big Data Based on Distributed Smartphone Neural Engine Processor
Awad, Fouad H.
Hamad, Murtadha M.
ELECTRONICS, 2022, 11 (06)
[25] Bridging High Velocity and High Volume Industrial Big Data Through Distributed In-Memory Storage & Analytics
Williams, Jenny Weisenberg
Aggour, Kareem S.
Interrante, John
McHugh, Justin
Pool, Eric
2014 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2014: 932 - 941
[26] Enhancement of the K-Means Algorithm for Mixed Data in Big Data Platforms
Koren, Oded
Hallin, Carina Antonia
Perel, Nir
Bendet, Dror
INTELLIGENT SYSTEMS AND APPLICATIONS, VOL 1, 2019, 868 : 1025 - 1040
[27] K-Means Clustering with Distributed Dimensions
Ding, Hu
Liu, Yu
Huang, Lingxiao
Li, Jian
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 48, 2016, 48
[28] NEW ALGORITHM FOR CLUSTERING DISTRIBUTED DATA USING K-MEANS
Khedr, Ahmed M.
Bhatnagar, Raj K.
COMPUTING AND INFORMATICS, 2014, 33 (04) : 943 - 964
[29] An Enhancement of K-means Clustering Algorithm
Gu, Jirong
Zhou, Jieming
Chen, Xianwei
2009 INTERNATIONAL CONFERENCE ON BUSINESS INTELLIGENCE AND FINANCIAL ENGINEERING, PROCEEDINGS, 2009: 237 - 240
[30] In-Memory Performance for Big Data
Graefe, Goetz
Volos, Haris
Kimura, Hideaki
Kuno, Harumi
Tucek, Joseph
Lillibridge, Mark
Veitch, Alistair
PROCEEDINGS OF THE VLDB ENDOWMENT, 2014, 8 (01): 37 - 48
