Replication Management Framework for HDFS based on Prediction Technique

Cited by: 5
Authors
Bui, Dinh-Mao [1]
Thien Huynh-The [1]
Lee, Sungyoung [1]
Li, Bin [2]
Wang, Jin [2]
Affiliations
[1] Kyung Hee Univ, Dept Comp Engn, Suwon, South Korea
[2] Yangzhou Univ, Coll Informat Engn, Yangzhou 225009, Jiangsu, Peoples R China
Keywords
Replication; HDFS; proactive prediction; Bayesian learning; Gaussian process
DOI
10.1109/CBD.2015.19
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The number of applications based on Apache Hadoop is increasing dramatically due to the robustness and dynamic features of this system. At the heart of Apache Hadoop, the Hadoop Distributed File System (HDFS) provides reliability, scalability, and high availability for computation by applying a static replication strategy. However, because of the characteristics of parallel operations on the application layer, the access frequency differs from one data file to another in HDFS. Consequently, maintaining the same replication mechanism for every data file might degrade performance. By rigorously considering the drawbacks of the HDFS architecture, this paper proposes an approach to dynamically replicate data files based on predictive analysis. With the help of probability theory, the utilization of each data file can be predicted to create an individual replication strategy. Each data file can then be replicated according to its own access potential. Hence, this approach improves data locality while keeping the redundancy of data storage comparable to that of the default replication scheme.
Pages: 58-63
Page count: 6