Replication Management Framework for HDFS based on Prediction Technique

Cited by: 5
Authors
Bui, Dinh-Mao [1]
Thien Huynh-The [1]
Lee, Sungyoung [1]
Li, Bin [2]
Wang, Jin [2]
Affiliations
[1] Kyung Hee Univ, Dept Comp Engn, Suwon, South Korea
[2] Yangzhou Univ, Coll Informat Engn, Yangzhou 225009, Jiangsu, Peoples R China
Keywords
Replication; HDFS; proactive prediction; Bayesian Learning; Gaussian Process
DOI
10.1109/CBD.2015.19
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The number of applications based on Apache Hadoop is increasing dramatically due to the robustness and dynamic features of this system. At the heart of Apache Hadoop, the Hadoop Distributed File System (HDFS) provides reliability, scalability, and high availability for computation by applying a static replication strategy. However, because of the characteristics of parallel operations at the application layer, the access frequency differs widely from one data file to another in HDFS. Consequently, maintaining the same replication mechanism for every data file might degrade performance. By rigorously considering the drawbacks of the HDFS architecture, this paper proposes an approach that dynamically replicates data files based on predictive analysis. With the help of probability theory, the utilization of each data file can be predicted to create an individual replication strategy. Each data file can then be replicated according to its own access potential. Hence, this approach improves data locality while keeping storage redundancy comparable to that of the default replication scheme.
Pages: 58-63
Page count: 6
Related papers
50 in total
  • [1] Supervised Learning based HDFS Replication Management System
    Ilakiyaa, R.
    Nalini, N. J.
    2017 INTERNATIONAL CONFERENCE ON TECHNICAL ADVANCEMENTS IN COMPUTERS AND COMMUNICATIONS (ICTACC), 2017, : 116 - 120
  • [2] Adaptive Replication Management in HDFS Based on Supervised Learning
    Bui, Dinh-Mao
    Hussain, Shujaat
    Huh, Eui-Nam
    Lee, Sungyoung
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2016, 28 (06) : 1369 - 1382
  • [3] An efficient replication management system for HDFS management
    Swaroopa K.
    Satya Phani Kumari A.
    Manne N.
    Satpathy R.
    Pavan Kumar T.
    Materials Today: Proceedings, 2023, 80 : 2799 - 2802
  • [4] Multicast-based Replication for Hadoop HDFS
    Wu, Jiadong
    Hong, Bo
    2015 16TH IEEE/ACIS INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, ARTIFICIAL INTELLIGENCE, NETWORKING AND PARALLEL/DISTRIBUTED COMPUTING (SNPD), 2015, : 143 - 148
  • [5] Placement Scheduling for Replication in HDFS Based on Probabilistic Approach
    Bui, Dinh-Mao
    Lee, Sungyoung
    INCLUSIVE SMART CITIES AND DIGITAL HEALTH, 2016, 9677 : 314 - 320
  • [6] Pseudo-Cache-Based IoT Small Files Management Framework in HDFS Cluster
    Isma Farah Siddiqui
    Nawab Muhammad Faseeh Qureshi
    Bhawani Shankar Chowdhry
    Muhammad Aslam Uqaili
    Wireless Personal Communications, 2020, 113 : 1495 - 1522
  • [7] A Metadata Management Mechanism Based on HDFS
    Chen, Xiaofeng
    Lou, Yuansheng
    Hu, Dongmei
    Applied Decisions in Area of Mechanical Engineering and Industrial Manufacturing, 2014, 577 : 1026 - 1029
  • [8] Pseudo-Cache-Based IoT Small Files Management Framework in HDFS Cluster
    Siddiqui, Isma Farah
    Qureshi, Nawab Muhammad Faseeh
    Chowdhry, Bhawani Shankar
    Uqaili, Muhammad Aslam
    WIRELESS PERSONAL COMMUNICATIONS, 2020, 113 (03) : 1495 - 1522
  • [9] Classification based Metadata Management for HDFS
    Chandrasekar, Ashok
    Chandrasekar, Karthik
    Ramasatagopan, Harini
    Rafica, A. R.
    Balasubramaniyan, Jagadeesh
    2012 IEEE 14TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS & 2012 IEEE 9TH INTERNATIONAL CONFERENCE ON EMBEDDED SOFTWARE AND SYSTEMS (HPCC-ICESS), 2012, : 1021 - 1026
  • [10] Dynamic Replication Policy on HDFS Based on Machine Learning Clustering
    Ahmed, Motaz A.
    Khafagy, Mohamed H.
    Shaheen, Masoud E.
    Kaseb, Mostafa R.
    IEEE ACCESS, 2023, 11 : 18551 - 18559