Dynamic data replication and placement strategy in geographically distributed data centers

Cited by: 4
Authors
Bouhouch, Laila [1]
Zbakh, Mostapha [1]
Tadonki, Claude [2]
Affiliations
[1] Mohammed V Univ Rabat, Natl Sch Comp Sci & Syst Anal, Rabat, Morocco
[2] MINES ParisTech PSL CRI, Paris, France
Source
Keywords
big data; cloud computing; Cloudsim; data placement; dynamic data replication; CLOUD; OPPORTUNITIES;
DOI
10.1002/cpe.6858
Chinese Library Classification
TP31 [Computer Software];
Subject classification code
081202 ; 0835 ;
Abstract
With the evolution of geographically distributed data centers in the Cloud Computing landscape, and with the amount of data processed in these data centers growing at an exponential rate, processing massive data applications has become an important topic. Since a given task may require many datasets for its execution and the datasets are spread over several different data centers, finding an efficient way to manage dataset storage across the nodes of a Cloud system is a difficult problem. In fact, the execution time of a task might be influenced by the cost of data transfers, which mainly depends on two criteria. The first is the initial placement of the input datasets during the build-time phase, while the second is the replication of the datasets during the runtime phase. Replication is explicitly considered when datasets are migrated across data centers in order to make them locally available wherever needed. Data placement and data replication are important challenges in Cloud Computing. Nevertheless, many studies focus exclusively on either data placement or data replication. In this paper, a combination of a data placement strategy followed by a dynamic data replication management strategy is proposed, with the purpose of reducing the associated cost of all data transfers between the (distant) data centers. Our proposed data placement approach considers the main characteristics of a data center, such as storage capacity and read/write speeds, to store the datasets efficiently, while our dynamic data replication management approach considers three parameters: the number of replicas in the system, the dependency between datasets and tasks, and the storage capacity of data centers. The decision of when and whether to keep or delete replicas is determined by the fulfillment of those three parameters. Our approach estimates the total execution time of the tasks as well as the monetary cost, taking the data transfer activity into account.
Our experiments are conducted using the CloudSim simulator. The obtained results show that our proposed strategies produce efficient data management by reducing the overheads of the data transfers, compared to both a data placement without replication (by 76%) and the selected data replication approach from Kouidri et al. (by 52%), and by improving the financial cost.
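The abstract names three parameters that drive the keep-or-delete decision for a replica but does not give the rule itself. The following is only a minimal sketch of how such a check might be combined, not the authors' algorithm. The names `DataCenter`, `keep_replica`, and the threshold `max_replicas` are hypothetical, as is the assumption that all three conditions must hold at once.

```python
from dataclasses import dataclass


@dataclass
class DataCenter:
    """Hypothetical data center model: only storage capacity is tracked."""
    name: str
    capacity_gb: float
    used_gb: float = 0.0

    def free_gb(self) -> float:
        return self.capacity_gb - self.used_gb


def keep_replica(replica_count: int,
                 pending_dependents: int,
                 dc: DataCenter,
                 dataset_gb: float,
                 max_replicas: int = 3) -> bool:
    """Decide whether a dataset copy transferred to `dc` is kept as a replica.

    Assumed rule (not taken from the paper): keep the copy only if
    1. the system holds fewer than `max_replicas` copies of the dataset,
    2. at least one not-yet-executed task at `dc` still depends on it,
    3. `dc` has enough free storage to hold it.
    """
    under_limit = replica_count < max_replicas
    still_needed = pending_dependents > 0
    fits = dc.free_gb() >= dataset_gb
    return under_limit and still_needed and fits
```

For instance, with a data center of 100 GB capacity and 40 GB used, a 50 GB dataset that has one existing replica and two pending dependent tasks would be kept under this assumed rule, while the same dataset with no pending tasks would be deleted after use.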
Pages: 20
Related papers
50 in total
  • [1] Optimal Dynamic Placement of Virtual Machines in Geographically Distributed Cloud Data Centers
    Teyeb, Hana
    Ben Hadj-Alouane, Nejib
    Tata, Samir
    Balma, Ali
    [J]. INTERNATIONAL JOURNAL OF COOPERATIVE INFORMATION SYSTEMS, 2017, 26 (03)
  • [2] A Big Data Placement Strategy in Geographically Distributed Datacenters
    Bouhouch, Laila
    Zbakh, Mostapha
    Tadonki, Claude
    [J]. PROCEEDINGS OF 2020 5TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND ARTIFICIAL INTELLIGENCE: TECHNOLOGIES AND APPLICATIONS (CLOUDTECH'20), 2020, : 190 - 198
  • [3] Dynamic VM Placement Method for Minimizing Energy and Carbon Cost in Geographically Distributed Cloud Data Centers
    Khosravi, Atefeh ko
    Andrew, Lachlan L. H.
    Buyya, Rajkumar
    [J]. IEEE TRANSACTIONS ON SUSTAINABLE COMPUTING, 2017, 2 (02) : 183 - 196
  • [4] The cloud of geographically distributed data centers
    Fedchenkov, Petr
    Shevel, Andrey
    Khoruzhnikov, Sergey
    Sadov, Oleg
    Lazo, Oleg
    Samokhin, Nikitta
    [J]. 23RD INTERNATIONAL CONFERENCE ON COMPUTING IN HIGH ENERGY AND NUCLEAR PHYSICS (CHEP 2018), 2019, 214
  • [5] Energy-aware coordinated operation strategy of geographically distributed data centers
    Zhou, Shibo
    Zhou, Ming
    Wu, Zhaoyuan
    Wang, Yuyang
    Li, Gengyin
    [J]. INTERNATIONAL JOURNAL OF ELECTRICAL POWER & ENERGY SYSTEMS, 2024, 159
  • [6] Flexible Network Architecture and Provisioning Strategy for Geographically Distributed Metro Data Centers
    Fiorani, Matteo
    Samadi, Payman
    Shen, Yiwen
    Wosinska, Lena
    Bergman, Keren
    [J]. JOURNAL OF OPTICAL COMMUNICATIONS AND NETWORKING, 2017, 9 (05) : 385 - 392
  • [7] Energy and carbon-aware initial VM placement in geographically distributed cloud data centers
    Khodayarseresht, Ehsan
    Shameli-Sendi, Alireza
    Fournier, Quentin
    Dagenais, Michel
    [J]. SUSTAINABLE COMPUTING-INFORMATICS & SYSTEMS, 2023, 39
  • [8] Differentiated Replication Strategy in Data Centers
    Tung Nguyen
    Cutway, Anthony
    Shi, Weisong
    [J]. NETWORK AND PARALLEL COMPUTING, 2010, 6289 : 277 - 288
  • [9] Optimal data placement strategy considering capacity limitation and load balancing in geographically distributed cloud
    Li, Chunlin
    Cai, Qianqian
    Youlong, Lou
    [J]. FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2022, 127 : 142 - 159
  • [10] GreenPacker: renewable- and fragmentation-aware VM placement for geographically distributed green data centers
    Zeinab Nadalizadeh
    Mahmoud Momtazpour
    [J]. The Journal of Supercomputing, 2022, 78 : 1434 - 1457