QoS-Aware Data Placement for MapReduce Applications in Geo-Distributed Data Centers

Cited by: 10
Authors
Chen, Wuhui [1, 2]
Liu, Baichuan [1, 2]
Paik, Incheon [3]
Li, Zhenni [4]
Zheng, Zibin [1, 2]
Affiliations
[1] Sun Yat Sen Univ, Sch Data & Comp Sci, Guangzhou 510085, Peoples R China
[2] Sun Yat Sen Univ, Natl Engn Res Ctr Digital Life, Guangzhou 510085, Peoples R China
[3] Univ Aizu, Sch Comp Sci & Engn, Aizu Wakamatsu, Fukushima 9650006, Japan
[4] Guangdong Univ Technol, Sch Automat, Guangzhou 510006, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Data centers; Quality of service; Data transfer; Distributed databases; Data models; Optimization; Network topology; Big-data processing; data placement; geo-distributed data centers; QoS aware; BIG DATA; CLOUD;
DOI
10.1109/TEM.2020.2971717
Chinese Library Classification number
F [Economics]
Discipline classification code
02
Abstract
With growing data volumes and the scaling of data center clusters, communication resources often become a bottleneck in service provisioning for many MapReduce applications (e.g., training machine learning models). Data placement that brings data blocks closer to data consumers (e.g., MapReduce applications) is therefore seen as a promising solution. In this article, we propose an efficient data-placement technique that considers both network traffic reduction and QoS guarantees for the data blocks in order to optimize the use of communication resources. We first formulate the joint optimization of the data-placement problem, propose a generic model for minimizing communication costs, and show that the joint data-placement problem is NP-hard. To solve this problem, we propose a heuristic algorithm that considers traffic flows in the network topology of data centers. The algorithm first seeks an optimal QoS-aware data placement based on golden division on a Zipf-like replica distribution, then transforms the joint data-placement problem into a block-dependence tree (BDT) construction problem, and finally reduces the BDT construction to a graph-partitioning problem. The experimental results demonstrate that our data-placement approach could effectively improve the performance of MapReduce jobs, with lower communication costs and shorter job execution times for big-data processing.
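As a rough illustration of what a Zipf-like replica distribution can look like, the sketch below assigns replica counts to data blocks in proportion to a Zipf-like popularity weight. It is not the authors' algorithm: the function names, the `skew` parameter, the `min_replicas` floor, and the largest-remainder rounding are all assumptions made for this example, and the golden-division, BDT, and graph-partitioning steps are not modeled.

```python
def zipf_weights(num_blocks, skew=1.0):
    """Normalized Zipf-like weights: the block at rank r gets weight ~ 1 / r**skew.

    `skew` is a hypothetical tuning parameter, not a value from the article.
    """
    raw = [1.0 / (rank ** skew) for rank in range(1, num_blocks + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def allocate_replicas(weights, total_replicas, min_replicas=1):
    """Split a replica budget across blocks in proportion to their weights.

    Every block first receives `min_replicas` (an assumed floor); the rest of
    the budget is shared proportionally, with largest-remainder rounding so
    that the counts add up exactly to `total_replicas`.
    """
    n = len(weights)
    if total_replicas < n * min_replicas:
        raise ValueError("replica budget is too small for the minimum per block")
    spare = total_replicas - n * min_replicas
    shares = [w * spare for w in weights]
    counts = [min_replicas + int(s) for s in shares]
    # Hand out the replicas lost to rounding down, largest fraction first.
    leftover = max(total_replicas - sum(counts), 0)
    by_fraction = sorted(range(n), key=lambda i: shares[i] - int(shares[i]), reverse=True)
    for i in by_fraction[:leftover]:
        counts[i] += 1
    return counts


if __name__ == "__main__":
    weights = zipf_weights(num_blocks=4, skew=1.0)
    print(allocate_replicas(weights, total_replicas=12))
```

Under this assumed scheme, the most popular block receives the most replicas while every block keeps at least one, which is the general shape a Zipf-like replica distribution implies.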
Pages: 120 - 136
Page count: 17
Related papers
50 in total
  • [11] QoS-aware replica placement for data intensive applications
    Fu, Xiong
    Zhu, Xin-Xin
    Han, Jing-Yu
    Wang, Ru-Chuan
    Journal of China Universities of Posts and Telecommunications, 2013, 20 (03) : 43 - 47
  • [12] QoS-Aware Distributed Replica Placement in Hierarchical Data Grids
    Shorfuzzaman, Mohammad
    Graham, Peter
    Eskicioglu, Rasit
    25TH IEEE INTERNATIONAL CONFERENCE ON ADVANCED INFORMATION NETWORKING AND APPLICATIONS (AINA 2011), 2011 : 291 - 299
  • [13] Congestion-Aware Traffic Allocation for Geo-Distributed Data Centers
    Tao, Xiaoyi
    Ota, Kaoru
    Dong, Mianxiong
    Borjigin, Wuyunzhaola
    Qi, Heng
    Li, Keqiu
    IEEE TRANSACTIONS ON CLOUD COMPUTING, 2022, 10 (03) : 1675 - 1687
  • [14] Type-aware Task Placement in Geo-distributed Data Centers with Low OPEX using Data Center Resizing
    Gu, Lin
    Zeng, Deze
    Guo, Song
    Yu, Shui
    2014 INTERNATIONAL CONFERENCE ON COMPUTING, NETWORKING AND COMMUNICATIONS (ICNC), 2014 : 211 - 215
  • [15] Workload-Aware Scheduling Across Geo-distributed Data Centers
    Jin, Yibo
    Gao, Yuan
    Qian, Zhuzhong
    Zhai, Mingyu
    Peng, Hui
    Lu, Sanglu
    2016 IEEE TRUSTCOM/BIGDATASE/ISPA, 2016 : 1455 - 1462
  • [16] Placement of High Availability Geo-Distributed Data Centers in Emerging Economies
    Liu, Ruiyun
    Sun, Weiqiang
    Hu, Weisheng
    IEEE TRANSACTIONS ON CLOUD COMPUTING, 2023, 11 (03) : 3274 - 3288
  • [17] Time Optimization Modeling for Big Data Placement and Analysis for Geo-Distributed Data Centers
    Khan, Awais
    Attique, Muhammad
    Chung, Tae-Sun
    Kim, Youngjae
    2016 IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING (CLUSTER), 2016 : 140 - 141
  • [18] Cross-MapReduce: Data transfer reduction in geo-distributed MapReduce
    Marzuni, Saeed Mirpour
    Savadi, Abdorreza
    Toosi, Adel N.
    Naghibzadeh, Mahmoud
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2021, 115 : 188 - 200
  • [19] Privacy Regulation Aware Process Mapping in Geo-Distributed Cloud Data Centers
    Zhou, Amelie Chi
    Xiao, Yao
    Gong, Yifan
    He, Bingsheng
    Zhai, Jidong
    Mao, Rui
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2019, 30 (08) : 1872 - 1888
  • [20] Location-Aware Data Placement for Geo-distributed Online Social Networks
    Zhou, Jingya
    Fan, Jianxi
    Jia, Juncheng
    Cheng, Baolei
    Liu, Zhao
    2016 FOURTH INTERNATIONAL CONFERENCE ON ADVANCED CLOUD AND BIG DATA (CBD 2016), 2016 : 234 - 239