Replica Scheduling Strategy for Streaming Data Mining

Authors
Li, Shufan [1]
Yu, Siyuan [1]
Xiao, Fang [2]
Affiliations
[1] Wuhan Univ Technol, Comp Sci & Artificial Intelligence Sch, Wuhan, Peoples R China
[2] Huazhong Univ Sci & Technol, Wuhan Natl Lab Optoelect, Wuhan, Peoples R China
Keywords
Streaming data mining; dynamic programming; replica scheduling strategy; cloud computing
DOI
10.14569/IJACSA.2022.0130503
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
In a distributed storage and computing framework, traditional streaming data mining techniques are inefficient when processing massive amounts of data. In this paper, we treat the data replicas in cloud storage as an allocatable resource for scheduling and propose the RepRM strategy to improve the efficiency of data mining and analysis. The key idea is to treat data replicas as the resource to be allocated and to use the backward induction method of dynamic programming to solve for the replica ratio, from which the optimal number of replicas is obtained. Experiments show that, compared with Hadoop's traditional scheduling method, scheduling with the RepRM strategy saves about 40-50% of the memory resources of a homogeneous cluster during parallel mining of streaming data and increases the throughput rate by 20% to 30%.
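The abstract does not give the RepRM formulation, so the following is only a generic illustration of the technique it names: backward induction in dynamic programming applied to a resource allocation problem in which replicas are the resource. The function name `allocate_replicas`, the `gain` callback, and the benefit table are all hypothetical and are not taken from the paper.

```python
from typing import Callable, List, Tuple


def allocate_replicas(
    num_blocks: int,
    total_replicas: int,
    gain: Callable[[int, int], float],
) -> Tuple[float, List[int]]:
    """Split total_replicas among num_blocks to maximise the summed gain.

    gain(i, r) is a hypothetical benefit of giving r replicas to block i;
    the paper's actual objective is not stated in the abstract.
    """
    # value[i][s]: best total gain obtainable from blocks i..last
    # when s replicas are still unallocated.
    value = [[0.0] * (total_replicas + 1) for _ in range(num_blocks + 1)]
    # choice[i][s]: number of replicas given to block i in that best plan.
    choice = [[0] * (total_replicas + 1) for _ in range(num_blocks)]

    # Backward pass: solve the last stage first, then work toward stage 0.
    for i in range(num_blocks - 1, -1, -1):
        for s in range(total_replicas + 1):
            best_r, best_v = 0, float("-inf")
            for r in range(s + 1):
                v = gain(i, r) + value[i + 1][s - r]
                if v > best_v:
                    best_r, best_v = r, v
            value[i][s] = best_v
            choice[i][s] = best_r

    # Forward pass: read the optimal decisions back out of the table.
    plan: List[int] = []
    remaining = total_replicas
    for i in range(num_blocks):
        r = choice[i][remaining]
        plan.append(r)
        remaining -= r
    return value[0][total_replicas], plan


# Illustrative benefit table (made-up numbers): table[i][r] is the gain
# from placing r replicas of block i, with diminishing returns.
table = [
    [0, 5, 9, 10],
    [0, 4, 6, 7],
    [0, 3, 5, 6],
]
best, plan = allocate_replicas(3, 3, lambda i, r: table[i][r])
print(best, plan)  # 13.0 [2, 1, 0]
```

With this made-up table, three replicas are best spent as two on the first block and one on the second. The table-based `gain` is a placeholder: a real scheduler would derive the benefit from measured memory use or throughput, which the abstract does not detail.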
Pages: 10-19
Page count: 10