Research and implementation of user clustering based on MapReduce in multimedia big data

Cited by: 0

Author:

Tongke Fan

Affiliation:

[1] Xi’an International University, School of Information and Network

Source:

Multimedia Tools and Applications | 2018 / Vol. 77

Keywords:

Multimedia big data; Cloud computing; Hadoop; MapReduce; Clustering algorithm

DOI:

Not available

Abstract:

Massive data sets are hard to interpret and slow to cluster, a recurring problem in big data settings. To address it, a Canopy + K-means clustering algorithm is proposed, and the MapReduce programming model is used to exploit the computing and storage capacity of a Hadoop cluster. A large population of buyers on Taobao serves as the application context for a case study carried out with Mahout, the data mining library of the Hadoop platform. A general procedure for mining with Mahout is also given. The MapReduce-based clustering algorithm shows good clustering quality and operation speed. Canopy + K-means is compared with K-means in terms of runtime, speed-up ratio, and scalability. Both clustering algorithms are tested on clusters with different numbers of nodes and on datasets of various scales. The experimental results show that Canopy + K-means runs faster than K-means; both achieve a good speed-up ratio in the Hadoop environment, and Canopy + K-means achieves the better one.
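
The abstract does not spell out how the Canopy step feeds K-means. A common arrangement, assumed here, is that a cheap Canopy pass picks the number of clusters and their initial centers, and K-means then refines them. The following is a minimal single-machine sketch of that idea in plain Python, not the paper's MapReduce or Mahout implementation; the function names, the Euclidean distance, and the thresholds `t1` (loose) and `t2` (tight) are illustrative assumptions.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(pts):
    """Component-wise mean of a non-empty list of tuples."""
    n = len(pts)
    return tuple(sum(c) / n for c in zip(*pts))


def canopy_centers(points, t1, t2):
    """Pick initial centers with a Canopy pass (assumed thresholds t1 > t2)."""
    if t1 <= t2:
        raise ValueError("t1 must be greater than t2")
    remaining = list(points)
    centers = []
    while remaining:
        seed = remaining.pop(0)
        # Points within the loose threshold t1 form the canopy around the seed.
        members = [seed] + [p for p in remaining if euclidean(seed, p) < t1]
        centers.append(mean(members))
        # Points within the tight threshold t2 cannot seed another canopy.
        remaining = [p for p in remaining if euclidean(seed, p) >= t2]
    return centers


def kmeans(points, centers, max_iter=20):
    """Standard K-means refinement starting from the given centers."""
    centers = list(centers)
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(len(centers)), key=lambda k: euclidean(p, centers[k]))
            for p in points
        ]
        # Update step: each center moves to the mean of its cluster.
        new_centers = []
        for k, old in enumerate(centers):
            cluster = [p for p, lab in zip(points, labels) if lab == k]
            new_centers.append(mean(cluster) if cluster else old)
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    init = canopy_centers(data, t1=3.0, t2=2.0)
    final_centers, final_labels = kmeans(data, init)
    print(len(init), final_labels)
```

In this arrangement the Canopy pass removes the need to guess the number of clusters up front and tends to start K-means near dense regions, which is one plausible reason for the shorter runtime the abstract reports.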

Pages: 10017-10031

Page count: 14

[1] Research and implementation of user clustering based on MapReduce in multimedia big data
Fan, Tongke
[J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2018, 77 (08) : 10017 - 10031