Research and implementation of user clustering based on MapReduce in multimedia big data

Cited by: 0

Author:

Tongke Fan

Affiliation:

[1] Xi’an International University, School of Information and Network

Source:

Multimedia Tools and Applications | 2018 / Vol. 77

Keywords:

Multimedia big data; Cloud computing; Hadoop; MapReduce; Clustering algorithm

DOI:

Not available

Abstract:

Massive data sets are hard to interpret and slow to cluster, which is a recurring problem in big data settings. To address this, a Canopy + K-means clustering algorithm is proposed, implemented with the MapReduce programming model to make full use of the computing and storage capacity of a Hadoop cluster. A large population of buyers on Taobao serves as the application context for a case study carried out with Mahout, the data mining library of the Hadoop platform. A general procedure for mining with Mahout is also given. The MapReduce-based clustering algorithm shows good clustering quality and running speed. The Canopy + K-means algorithm is compared with the K-means algorithm in terms of runtime, speed-up ratio, and scalability. Both algorithms are tested on clusters with different numbers of nodes and on datasets of various scales. The experimental results show that Canopy + K-means runs faster than K-means, that both achieve a good speed-up ratio in the Hadoop environment, and that Canopy + K-means performs considerably better than K-means in this respect.
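
The abstract names the Canopy + K-means combination without spelling it out. The usual idea is that a cheap canopy pass picks the initial centers (and so the number of clusters), and K-means then refines them. The following is a minimal single-machine sketch of that idea, not the paper's MapReduce or Mahout implementation; the function names, the thresholds `t1` and `t2`, and the sample points are assumptions made for illustration.

```python
import math


def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    dim = len(points[0])
    return tuple(sum(p[i] for p in points) / len(points) for i in range(dim))


def canopy_centers(points, t1, t2):
    """Canopy pass: loose threshold t1, tight threshold t2 (t1 > t2)."""
    if t1 <= t2:
        raise ValueError("t1 must be larger than t2")
    remaining = list(points)
    centers = []
    while remaining:
        seed = remaining.pop(0)
        # Every point within t1 of the seed belongs to this canopy.
        members = [p for p in points if math.dist(p, seed) <= t1]
        centers.append(mean(members))
        # Points within t2 of the seed may not seed another canopy.
        remaining = [p for p in remaining if math.dist(p, seed) > t2]
    return centers


def kmeans(points, centers, max_iter=100):
    """Lloyd's iterations starting from the given centers."""
    centers = list(centers)
    clusters = []
    for _ in range(max_iter):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # Keep the old center if a cluster ends up empty.
        new_centers = [mean(c) if c else centers[i]
                       for i, c in enumerate(clusters)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters


if __name__ == "__main__":
    # Hypothetical toy data: two well-separated groups of points.
    pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    seeds = canopy_centers(pts, t1=3.0, t2=2.0)
    final_centers, groups = kmeans(pts, seeds)
    print(len(seeds), [len(g) for g in groups])
```

In a MapReduce setting, the assignment step would typically run in the mappers and the center update in the reducers, but that split is not shown here.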

Pages: 10017-10031

Page count: 14
