Scalable Data Placement of Data-intensive Services in Geo-distributed Clouds

Cited by: 4
Authors
Atrey, Ankita [1 ]
Van Seghbroeck, Gregory [1 ]
Volckaert, Bruno [1 ]
De Turck, Filip [1 ]
Affiliations
[1] UGent, IDLAB Imec, Technol Pk, Ghent, Belgium
Keywords
Data Placement; Geo-distributed Clouds; Location-based Services; Online Social Networks; Scalability; Spectral Clustering; Hypergraphs; Approximation;
DOI
10.5220/0006767504970508
Chinese Library Classification
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
The advent of big data analytics and cloud computing technologies has resulted in widespread research into the data placement problem, which aims at properly placing data items into distributed datacenters. Although traditional schemes that uniformly partition the data across distributed nodes are the de facto standard for many popular distributed data stores such as HDFS or Cassandra, these methods may cause network congestion for data-intensive services, thereby reducing system throughput. This is because, as opposed to MapReduce-style workloads, data-intensive services require access to multiple datasets within each transaction. In this paper, we propose a scalable method for performing data placement of data-intensive services in geographically distributed clouds. The proposed algorithm partitions a set of data items across geo-distributed clouds using spectral clustering on hypergraphs. Additionally, our spectral clustering algorithm leverages randomized techniques to obtain low-rank approximations of the hypergraph matrix, thereby making computation of the spectra of the hypergraph Laplacian more scalable. Experiments on a real-world trace-based online social network dataset show that the proposed algorithm is effective, efficient, and scalable. Empirically, it is comparable to, or in certain scenarios better than, state-of-the-art techniques in efficacy on the evaluated metrics, while running up to 10 times faster.
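The idea in the abstract (spectral clustering on a hypergraph, with a randomized low-rank approximation standing in for a full eigendecomposition) can be illustrated with a small sketch. This is not the authors' implementation. It assumes the commonly used normalized hypergraph Laplacian L = I - Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2}, a basic randomized range finder, and a plain k-means step. It also ignores datacenter locations, capacities and traffic costs. All names (`hypergraph_embedding`, `kmeans`, `place`, `requests`) are hypothetical, and the toy workload is invented for illustration.

```python
import numpy as np

def hypergraph_embedding(H, k, w=None, oversample=4, seed=0):
    """Approximate the k leading eigenvectors of Dv^-1/2 H W De^-1 H^T Dv^-1/2.

    H is an (items x hyperedges) 0/1 incidence matrix. Assumes every item
    appears in at least one hyperedge and no hyperedge is empty.
    """
    H = np.asarray(H, dtype=float)
    n, m = H.shape
    w = np.ones(m) if w is None else np.asarray(w, dtype=float)
    d_v = H @ w            # weighted vertex degrees
    d_e = H.sum(axis=0)    # hyperedge sizes
    # A @ A.T equals the normalized matrix above, so its leading
    # eigenvectors are the leading left singular vectors of A.
    A = (H * np.sqrt(w / d_e)) / np.sqrt(d_v)[:, None]
    # Randomized range finder: project A onto a small random subspace.
    rng = np.random.default_rng(seed)
    r = min(k + oversample, n, m)
    Q, _ = np.linalg.qr(A @ rng.standard_normal((m, r)))
    U_small, _, _ = np.linalg.svd(Q.T @ A, full_matrices=False)
    U = (Q @ U_small)[:, :k]
    # Row-normalize so items in the same cluster map to nearby points.
    norms = np.maximum(np.linalg.norm(U, axis=1, keepdims=True), 1e-12)
    return U / norms

def kmeans(X, k, iters=50):
    """Lloyd's algorithm with deterministic farthest-point initialization."""
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def place(H, k, **kwargs):
    """Assign each data item to one of k partitions (e.g. datacenters)."""
    return kmeans(hypergraph_embedding(H, k, **kwargs), k)

# Toy workload (invented): each request touches several data items at once,
# so each request becomes one hyperedge over the items it accesses.
requests = [{0, 1, 2}, {0, 1}, {1, 2}, {3, 4, 5}, {3, 4}, {4, 5}]
H = np.zeros((6, len(requests)))
for e, items in enumerate(requests):
    H[list(items), e] = 1.0

labels = place(H, k=2)
print(labels)  # items 0-2 share one partition, items 3-5 the other
```

In this toy case the sketch size covers the full matrix, so the randomized step is exact. On a large incidence matrix the same projection would only need a sketch of k plus a few extra columns, which is where the scalability benefit described in the abstract would come from.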
Pages: 497 - 508
Page count: 12
Related papers
50 in total
  • [1] SpeCH: A scalable framework for data placement of data-intensive services in geo-distributed clouds
    Atrey, Ankita
    Van Seghbroeck, Gregory
    Mora, Higinio
    De Turck, Filip
    Volckaert, Bruno
    [J]. JOURNAL OF NETWORK AND COMPUTER APPLICATIONS, 2019, 142 : 1 - 14
  • [2] Location-aware Associated Data Placement for Geo-distributed Data-intensive Applications
    Yu, Boyang
    Pan, Jianping
    [J]. 2015 IEEE CONFERENCE ON COMPUTER COMMUNICATIONS (INFOCOM), 2015,
  • [3] Genetic Based Data Placement for Geo-Distributed Data-Intensive Applications in Cloud Computing
    Fan, Weifeng
    Peng, Jun
    Zhang, Xiaoyong
    Huang, Zhiwu
    [J]. ADVANCES IN SERVICES COMPUTING, 2016, 10065 : 253 - 265
  • [4] Unifying Data and Replica Placement for Data-intensive Services in Geographically Distributed Clouds
    Atrey, Ankita
    Van Seghbroeck, Gregory
    Mora, Higinio
    De Turck, Filip
    Volckaert, Bruno
    [J]. CLOSER: PROCEEDINGS OF THE 9TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND SERVICES SCIENCE, 2019, : 25 - 36
  • [5] Efficient Location-Aware Data Placement for Data-Intensive Applications in Geo-distributed Scientific Data Centers
    Zhang, Jinghui
    Chen, Jian
    Luo, Junzhou
    Song, Aibo
    [J]. TSINGHUA SCIENCE AND TECHNOLOGY, 2016, 21 (05) : 471 - 481
  • [6] Efficient Location-Aware Data Placement for Data-Intensive Applications in Geo-distributed Scientific Data Centers
    Jinghui Zhang
    Jian Chen
    Junzhou Luo
    Aibo Song
    [J]. Tsinghua Science and Technology, 2016, 21 (05) : 471 - 481
  • [7] Scalable and Adaptive Data Replica Placement for Geo-Distributed Cloud Storages
    Liu, Kaiyang
    Peng, Jun
    Wang, Jingrong
    Liu, Weirong
    Huang, Zhiwu
    Pan, Jianping
    [J]. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2020, 31 (07) : 1575 - 1587
  • [8] Data-intensive workflow management: For clouds and data-intensive and scalable computing environments
    De Oliveira, Daniel C.M.
    Liu, Ji
    Pacitti, Esther
    [J]. Synthesis Lectures on Data Management, 2019, 14 (04) : 1 - 179
  • [9] Interacting Data-Intensive Services Mining and Placement in Mobile Edge Clouds
    Huang, Yuze
    Huang, Jiwei
    Cheng, Bo
    Yao, Tianxiang
    Chen, Junliang
    [J]. PROCEEDINGS OF THE 23RD ANNUAL INTERNATIONAL CONFERENCE ON MOBILE COMPUTING AND NETWORKING (MOBICOM '17), 2017, : 558 - 560
  • [10] Awan: Locality-aware Resource Manager for Geo-distributed Data-intensive Applications
    Jonathan, Albert
    Chandra, Abhishek
    Weissman, Jon
    [J]. PROCEEDINGS 2016 IEEE INTERNATIONAL CONFERENCE ON CLOUD ENGINEERING (IC2E), 2016, : 32 - 41