K-means Clustering Algorithm for Large-scale Chinese Commodity Information Web Based on Hadoop

Cited by: 2

Authors:

Geng Yushui [1]

Zhang Lishuo [1]

Affiliations:

[1] Qilu Univ Technol, Sch Informat, Jinan 250353, Peoples R China

Source:

14TH INTERNATIONAL SYMPOSIUM ON DISTRIBUTED COMPUTING AND APPLICATIONS FOR BUSINESS, ENGINEERING AND SCIENCE (DCABES 2015) | 2015

Keywords:

K-Means clustering algorithm; Hadoop platform; MapReduce; Cloud computing; Big Data;

DOI:

10.1109/DCABES.2015.71

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Discipline classification codes:

081203 ; 0835 ;

Abstract:

With the growing popularity of the Internet, product information fills a great many web pages, and clustering is a natural way to extract the information one needs from them. However, the explosive growth of data has produced massive volumes of stored information, so clustering now faces problems such as high computational complexity and long running times, and the traditional K-Means clustering algorithm no longer meets the needs of today's big-data environments. This article therefore combines the advantages of the Hadoop platform and the MapReduce programming model and proposes a K-Means clustering algorithm for large-scale Chinese commodity information on the Web based on Hadoop. The Map function calculates the distance from each sample to the cluster centers and labels the sample with its category; the Reduce function summarizes the intermediate results and calculates the new cluster centers for the next round of iteration. Experimental results show that this method improves clustering processing speed.
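The Map and Reduce roles described in the abstract can be illustrated with a minimal in-memory sketch. This is not the authors' implementation and does not use Hadoop: it only mimics the map, shuffle, and reduce phases in plain Python. The function names (`kmeans_map`, `kmeans_reduce`, `kmeans_iteration`, `kmeans`), the use of Euclidean distance, and the stopping rule are all assumptions made for illustration; the abstract does not specify them.

```python
from collections import defaultdict
from math import dist  # Euclidean distance (Python 3.8+); the metric is an assumption


def kmeans_map(sample, centers):
    """Map step: emit (index of the nearest center, sample)."""
    nearest = min(range(len(centers)), key=lambda i: dist(sample, centers[i]))
    return nearest, sample


def kmeans_reduce(cluster_id, samples):
    """Reduce step: average the samples assigned to one cluster."""
    n = len(samples)
    new_center = tuple(sum(coords) / n for coords in zip(*samples))
    return cluster_id, new_center


def kmeans_iteration(samples, centers):
    """One round: map every sample, group by key (the shuffle), then reduce."""
    groups = defaultdict(list)
    for sample in samples:
        cluster_id, value = kmeans_map(sample, centers)
        groups[cluster_id].append(value)
    new_centers = list(centers)  # a center with no samples keeps its old position
    for cluster_id, members in groups.items():
        _, new_centers[cluster_id] = kmeans_reduce(cluster_id, members)
    return new_centers


def kmeans(samples, centers, max_iter=20):
    """Repeat rounds until the centers stop moving or max_iter is reached."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        updated = kmeans_iteration(samples, centers)
        if updated == centers:
            break
        centers = updated
    return centers
```

On a real cluster, each Map call would run on a separate data split and the framework would perform the grouping by key; here a dictionary stands in for that shuffle so the data flow of one iteration is visible in a few lines.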

Pages: 256-259

Page count: 4
