Parallel Web Mining System Based on Cloud Platform

Cited by: 1
Authors
Shengmei Luo [1]
Qing He [2]
Lixia Liu [1]
Xiang Ao [2, 3]
Ning Li [2, 3]
Fuzhen Zhuang [2]
Affiliations
[1] Pre-Research department of ZTE
[2] Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences
[3] Graduate University of Chinese Academy of Sciences
Funding
National Natural Science Foundation of China
Keywords
web mining; large scale; high volume; high dimension; cloud computing
DOI
Not available
Chinese Library Classification (CLC) number
TP393.09; TP391.1 [Text information processing]
Discipline classification codes
080402; 081203; 0835
Abstract
Traditional machine-learning algorithms struggle to handle the very large volume of data generated on the internet. Real-world applications urgently need machine-learning algorithms that can handle large-scale, high-dimensional text data. Cloud computing delivers computing and storage as a service to a heterogeneous community of recipients, and it has recently attracted much interest in industry and academia. Most previous work on cloud platforms focuses only on parallel algorithms for structured data. In this paper, we focus on the parallel implementation of web-mining algorithms and develop a parallel web-mining system that comprises the following subsystems: a parallel web crawler; parallel text extract, transform, and load (ETL) and modeling; and parallel text mining and applications. The complete system supports a variety of real-world web-mining applications on massive data.
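The abstract does not say which framework or algorithms the system uses, so the following is only a minimal sketch of the general idea behind parallel text ETL and modeling: each document is tokenized and counted independently (a map step), and the per-document counts are then merged (a reduce step). All function names here are hypothetical, and a local thread pool stands in for the cloud platform purely for illustration.

```python
import re
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def extract_terms(page_text):
    """Hypothetical ETL step: lowercase the text and split it into tokens."""
    return re.findall(r"[a-z0-9]+", page_text.lower())


def map_document(page_text):
    """Map step: count term occurrences within a single document."""
    return Counter(extract_terms(page_text))


def reduce_counts(partial_counts):
    """Reduce step: merge per-document counts into corpus-wide totals."""
    total = Counter()
    for counts in partial_counts:
        total.update(counts)
    return total


def parallel_term_counts(pages, workers=4):
    """Run the map step concurrently, then reduce the partial results."""
    # Assumption: a thread pool stands in for the workers of a real cluster.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = list(pool.map(map_document, pages))
    return reduce_counts(partial)


# Two toy "crawled pages" used only to exercise the sketch.
pages = [
    "Cloud computing delivers storage as a service.",
    "Web mining on a cloud platform.",
]
totals = parallel_term_counts(pages)
```

Because the map step touches one document at a time, the same structure could in principle be spread across many machines; how the paper's system actually partitions the work is not stated in this record.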
Pages: 45-53
Page count: 9