Replication Management Framework for HDFS based on Prediction Technique

Cited by: 5
Authors
Bui, Dinh-Mao [1]
Thien Huynh-The [1]
Lee, Sungyoung [1]
Li, Bin [2]
Wang, Jin [2]
Affiliations
[1] Kyung Hee Univ, Dept Comp Engn, Suwon, South Korea
[2] Yangzhou Univ, Coll Informat Engn, Yangzhou 225009, Jiangsu, Peoples R China
Keywords
Replication; HDFS; proactive prediction; Bayesian Learning; Gaussian Process;
DOI
10.1109/CBD.2015.19
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The number of applications based on Apache Hadoop is increasing dramatically due to the robustness and dynamic features of this system. At the heart of Apache Hadoop, the Hadoop Distributed File System (HDFS) provides reliability, scalability, and high availability for computation by applying a static replication strategy. However, because applications operate on the data in parallel, the access frequency differs widely from one data file to another. Consequently, maintaining the same replication mechanism for every data file can degrade performance. After examining these drawbacks of the HDFS architecture, this paper proposes an approach that dynamically replicates data files based on predictive analysis. With the help of probability theory, the utilization of each data file is predicted in order to create an individual replication strategy, and each file is then replicated according to its own access potential. Hence, this approach improves data locality while keeping a level of storage redundancy comparable to that of the default replication scheme.
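The idea in the abstract, predicting each file's access level and deriving a per-file replication factor from it, can be illustrated with a minimal Python sketch. This is not the paper's method: the keywords point to Bayesian learning with a Gaussian process, whereas the sketch swaps in plain exponential smoothing as the predictor to stay dependency-free. The function names, the smoothing weight `alpha`, the factor bounds, and the sample paths are all hypothetical; the default factor of 3 is the usual HDFS default.

```python
def smooth_forecast(history, alpha=0.5):
    """One-step-ahead forecast of access count via exponential smoothing.

    Stand-in predictor (assumption): the paper uses a Gaussian process instead.
    """
    if not history:
        return 0.0
    forecast = float(history[0])
    for count in history[1:]:
        forecast = alpha * count + (1 - alpha) * forecast
    return forecast


def replication_plan(access_history, default_factor=3, min_factor=1, max_factor=6):
    """Map each file to a replication factor scaled by its predicted access.

    Factors are scaled relative to the mean forecast so that the average
    factor stays near default_factor, then clamped to [min_factor, max_factor].
    """
    forecasts = {path: smooth_forecast(h) for path, h in access_history.items()}
    mean = sum(forecasts.values()) / len(forecasts) if forecasts else 0.0
    plan = {}
    for path, predicted in forecasts.items():
        if mean == 0:
            factor = default_factor  # no access signal: fall back to the default
        else:
            factor = round(default_factor * predicted / mean)
        plan[path] = max(min_factor, min(max_factor, factor))
    return plan


# Hypothetical per-period access counts for three files.
history = {
    "/logs/hot": [10, 20, 40],
    "/logs/warm": [10, 10, 10],
    "/logs/cold": [2, 1, 0],
}
plan = replication_plan(history)
print(plan)
```

In this toy input the frequently read file receives more replicas and the rarely read file fewer, while the total number of replicas equals what a uniform factor of 3 would use. A plan like this could then be applied per path with the HDFS shell command `hdfs dfs -setrep`.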
Pages: 58-63
Number of pages: 6