Optimizing data placement in heterogeneous Hadoop clusters

Authors
Runqun Xiong
Junzhou Luo
Fang Dong
Affiliation
[1] Southeast University, School of Computer Science and Engineering
Source
Cluster Computing | 2015 / Vol. 18
Keywords
Hadoop cluster; HDFS; Data placement; Heterogeneous; Replica;
Abstract
Data placement decisions in the Hadoop Distributed File System (HDFS) are critical to data locality, which is a primary criterion for task scheduling in the MapReduce model and ultimately affects application performance. The existing rack-aware data placement strategy and replication scheme of HDFS work well with the MapReduce framework in homogeneous Hadoop clusters, but in practice, such a data placement policy can noticeably reduce MapReduce performance and may increase energy dissipation in heterogeneous environments. In addition, HDFS applies a fixed default replication factor to every data block, which wastes storage space when the Hadoop system holds a large amount of inactive data. In this paper, we propose a novel data placement strategy (SLDP) for heterogeneous Hadoop clusters. SLDP first uses a heterogeneity-aware algorithm to divide the nodes into several virtual storage tiers (VSTs), and then places data blocks across the nodes in each VST cyclically according to the hotness of the data. Furthermore, SLDP uses hotness-proportional replication to save disk space and also provides an effective power control function. Experimental results on two real data-intensive applications show that SLDP is energy-efficient and space-saving, and that it significantly improves MapReduce performance in a heterogeneous Hadoop cluster.
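The abstract does not give the details of SLDP, so the following is only a minimal illustrative sketch of the three ideas it names: grouping nodes into tiers by capability, hotness-proportional replication, and cycling through the nodes of a tier when placing blocks. It is not the authors' algorithm. All names (`build_tiers`, `replica_count`, `place_blocks`), the single capability score per node, the hotness range of 0 to 1, the linear replica scaling, and the rule that hotter blocks go to faster tiers are assumptions made for illustration. The power control function is not modeled.

```python
from itertools import cycle
from math import ceil


def build_tiers(node_capability, num_tiers):
    """Rank nodes by a capability score (assumed metric) and split them
    into contiguous groups, standing in for virtual storage tiers."""
    ranked = sorted(node_capability, key=node_capability.get, reverse=True)
    size = ceil(len(ranked) / num_tiers)  # assumes at least one node
    return [ranked[i:i + size] for i in range(0, len(ranked), size)]


def replica_count(hotness, min_replicas=1, max_replicas=3):
    """Scale the replica count linearly with hotness in [0, 1] (assumed rule)."""
    hotness = min(max(hotness, 0.0), 1.0)
    return min_replicas + round(hotness * (max_replicas - min_replicas))


def place_blocks(block_hotness, tiers):
    """Assign each block to a tier by hotness, then walk that tier's nodes
    cyclically so consecutive replicas land on different nodes."""
    cursors = [cycle(tier) for tier in tiers]
    placement = {}
    for block, hotness in sorted(block_hotness.items(), key=lambda kv: -kv[1]):
        hotness = min(max(hotness, 0.0), 1.0)
        # Assumption: the hottest blocks map to tier 0 (the fastest nodes).
        tier_idx = min(int((1.0 - hotness) * len(tiers)), len(tiers) - 1)
        # Never request more replicas than the tier has nodes.
        n = min(replica_count(hotness), len(tiers[tier_idx]))
        placement[block] = [next(cursors[tier_idx]) for _ in range(n)]
    return placement


if __name__ == "__main__":
    nodes = {"n1": 8.0, "n2": 6.0, "n3": 2.0, "n4": 1.0}  # hypothetical scores
    tiers = build_tiers(nodes, num_tiers=2)
    print(place_blocks({"hot": 1.0, "cold": 0.0}, tiers))
```

In this sketch a cold block keeps a single replica on a slow tier, which is one way to picture how a hotness-dependent replica count could save disk space compared with a fixed replication factor.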
Pages: 1465-1480
Page count: 15
Related papers
50 in total
  • [1] Optimizing data placement in heterogeneous Hadoop clusters
    Xiong, Runqun
    Luo, Junzhou
    Dong, Fang
    CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2015, 18 (04): 1465-1480
  • [2] On a Dynamic Data Placement Strategy for Heterogeneous Hadoop Clusters
    Liu, Yang
    Wu, Chase Q.
    Wang, Meng
    Hou, Aiqin
    Wang, Yongqiang
    2018 INTERNATIONAL SYMPOSIUM ON NETWORKS, COMPUTERS AND COMMUNICATIONS (ISNCC 2018), 2018
  • [3] An improved data placement strategy in a heterogeneous Hadoop cluster
    Zhao, Wentao
    Meng, Lingjun
    Sun, Jiangfeng
    Ding, Yang
    Zhao, Haohao
    Wang, Lina
    Open Cybernetics and Systemics Journal, 2014, 8 (01): 957-963
  • [4] An Improved data placement strategy in a heterogeneous hadoop cluster
    Zhao, Wentao
    Meng, Lingjun
    Sun, Jiangfeng
    Ding, Yang
    Zhao, Haohao
    Wang, Lina
    Open Cybernetics and Systemics Journal, 2015, 9 (01): 792-798
  • [5] A Dynamic Data Placement Strategy for Hadoop in Heterogeneous Environments
    Lee, Chia-Wei
    Hsieh, Kuang-Yu
    Hsieh, Sun-Yuan
    Hsiao, Hung-Chang
    BIG DATA RESEARCH, 2014, 1: 14-22
  • [6] A Dynamic Data Placement Policy for Heterogeneous Hadoop Cluster
    Shithil, Santa Maria
    Saha, Tushar Kanti
    Sharma, Tanusree
    2017 4TH INTERNATIONAL CONFERENCE ON ADVANCES IN ELECTRICAL ENGINEERING (ICAEE), 2017: 302-307
  • [7] HaDaap: A hotness-aware data placement strategy for improving storage efficiency in heterogeneous Hadoop clusters
    Xiong, Runqun
    Du, Yao
    Jin, Jiahui
    Luo, Junzhou
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2018, 30 (20)
  • [8] Optimizing Hadoop Scheduling in Single-Board-Computer-Based Heterogeneous Clusters
    Qureshi, Basit
    COMPUTATION, 2024, 12 (05)
  • [9] Optimizing the Placement of Data Collection Services on Vehicle Clusters
    Sharma, Kanika
    Butler, Bernard
    Jennings, Brendan
    Kennedy, John
    Loomba, Radhika
    2018 IEEE 29TH ANNUAL INTERNATIONAL SYMPOSIUM ON PERSONAL, INDOOR AND MOBILE RADIO COMMUNICATIONS (PIMRC), 2018: 1800-1806
  • [10] Novel data-placement scheme for improving the data locality of Hadoop in heterogeneous environments
    Bae, Minho
    Yeo, Sangho
    Park, Gyudong
    Oh, Sangyoon
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2021, 33 (18)