Adaptive Replication Management in HDFS Based on Supervised Learning

Cited by: 27
|
Authors
Bui, Dinh-Mao [1]
Hussain, Shujaat [1]
Huh, Eui-Nam [1]
Lee, Sungyoung [1]
Affiliations
[1] Kyung Hee Univ, Dept Comp Engn, Suwon 446701, South Korea
Funding
National Research Foundation of Singapore;
Keywords
Replication; HDFS; proactive prediction; optimization; Bayesian learning; Gaussian process; ERASURE CODES;
DOI
10.1109/TKDE.2016.2523510
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The number of applications based on Apache Hadoop is dramatically increasing due to the robustness and dynamic features of this system. At the heart of Apache Hadoop, the Hadoop Distributed File System (HDFS) provides the reliability and high availability for computation by applying a static replication by default. However, because of the characteristics of parallel operations on the application layer, the access rate for each data file in HDFS is completely different. Consequently, maintaining the same replication mechanism for every data file leads to detrimental effects on the performance. By rigorously considering the drawbacks of the HDFS replication, this paper proposes an approach to dynamically replicate the data file based on the predictive analysis. With the help of probability theory, the utilization of each data file can be predicted to create a corresponding replication strategy. Eventually, the popular files can be subsequently replicated according to their own access potentials. For the remaining low potential files, an erasure code is applied to maintain the reliability. Hence, our approach simultaneously improves the availability while keeping the reliability in comparison to the default scheme. Furthermore, the complexity reduction is applied to enhance the effectiveness of the prediction when dealing with Big Data.
Pages: 1369 - 1382
Page count: 14
Related papers
50 in total
  • [21] Adaptive Beam Sweeping With Supervised Learning
    Lei, Wanlu
    Lu, Chenguang
    Huang, Yezi
    Rao, Jing
    Xiao, Ming
    Skoglund, Mikael
    IEEE WIRELESS COMMUNICATIONS LETTERS, 2022, 11 (12) : 2650 - 2654
  • [22] Pertinent user profile based on adaptive semi-supervised learning
    Rebai, Rim Zghal
    Ghorbel, Leila
    Zayani, Corinne Amel
    Amous, Ikram
    17TH INTERNATIONAL CONFERENCE IN KNOWLEDGE BASED AND INTELLIGENT INFORMATION AND ENGINEERING SYSTEMS - KES2013, 2013, 22 : 313 - 320
  • [23] Sample Selection Method in Supervised Learning Based on Adaptive Estimated Threshold
    Zhang, Zeya
    Zhou, Zhiheng
    Shen, Dongkai
    PROCEEDINGS OF 2013 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS (ICMLC), VOLS 1-4, 2013, : 1861 - 1864
  • [24] Proactive Re-replication Strategy in HDFS based Cloud Data Center
    Shwe, Thanda
    Aritsugi, Masayoshi
    PROCEEDINGS OF THE 10TH INTERNATIONAL CONFERENCE ON UTILITY AND CLOUD COMPUTING (UCC' 17), 2017, : 121 - 130
  • [25] Adaptive Active Learning for Semi-supervised Learning
    Li Y.-C.
    Xiao F.
    Chen Z.
    Li B.
    Ruan Jian Xue Bao/Journal of Software, 2020, 31 (12) : 3808 - 3822
  • [26] A Supervised Learning framework for Learning Management Systems
    Olive, David Monllao
    Huynh, Du Q.
    Reynolds, Mark
    Dougiamas, Martin
    Wiese, Damyon
    PROCEEDINGS OF THE FIRST INTERNATIONAL CONFERENCE ON DATA SCIENCE, E-LEARNING AND INFORMATION SYSTEMS 2018 (DATA'18), 2018,
  • [27] A Dynamic and Static Combined Replication Management Mechanism Based on Frequency Adaptive
    He, Zhenli
    Zhou, Hua
    Hu, Long
    Liu, Junhui
    Su, Lei
    CLOUD COMPUTING (CLOUDCOMP 2014), 2015, 142 : 116 - 125
  • [28] The online scene-adaptive tracker based on self-supervised learning
    Xiaoyu Chen
    Mingyang Chen
    Jinru Hang
    Fengchen He
    Wei Qi
    Jing Han
    Multimedia Tools and Applications, 2023, 82 : 15695 - 15713
  • [29] Adaptive safety degree-based safe semi-supervised learning
    Sang, Nong
    Gan, Haitao
    Fan, Yingle
    Wu, Wei
    Yang, Zhi
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2019, 10 (05) : 1101 - 1108