Adaptive priority-based data placement and multi-task scheduling in geo-distributed cloud systems

Cited by: 12
Authors
Li, Chunlin [1, 2, 3]
Liu, Jun [1]
Li, Weigang [2]
Luo, Youlong [1]
Affiliations
[1] Wuhan Univ Technol, Dept Comp Sci, Wuhan 430063, Peoples R China
[2] Wuhan Univ Sci & Technol, Minist Educ, Engn Res Ctr Met Automat & Measurement Technol, Wuhan 430081, Peoples R China
[3] Shandong Key Lab Intelligent Bldg Technol, Jinan 250101, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Distributed cloud; Data stream; Spark frame; Multi-task scheduling; REPLICA MANAGEMENT; OPTIMIZATION; RESOURCE; COST; PREDICTION; TASKS;
DOI
10.1016/j.knosys.2021.107050
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
With the rapid development and widespread use of cloud computing across applications, the number of users distributed across different regions has grown exponentially. As a result, geo-distributed cloud systems have become a research hotspot, and big data processing technology has emerged alongside them. Nowadays, the most widely used big data processing framework is Spark. However, as massive amounts of data are generated every moment and processing procedures grow more complex, the execution efficiency of Spark is greatly affected. For the data placement problem in the Spark framework of geo-distributed cloud systems, a data placement strategy based on RDD dynamic weight is introduced, in which a target node with strong computation capacity is selected to place the data. For the multi-task scheduling problem, a task is scheduled to a node whose computation capacity can satisfy the task's requirement. Then, considering job classification and computing node performance, an optimized task scheduling strategy is introduced. Experiments show that the proposed algorithms can effectively adjust the weight of node data placement according to actual task execution information and shorten the task execution time. (C) 2021 Elsevier B.V. All rights reserved.
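As an illustration only, the capacity-matching idea in the abstract (schedule a task to a node whose computation capacity satisfies its requirement, preferring a strong node) might be sketched as follows. This is not the paper's algorithm: the `Node` class, the `schedule` function, the single scalar capacity measure, and the largest-demand-first ordering are all assumptions made for this example, and the RDD dynamic weight and job classification steps are not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical compute node; capacity is an assumed scalar measure."""
    name: str
    capacity: float
    assigned: list = field(default_factory=list)


def schedule(tasks, nodes):
    """Greedy sketch: place each (task_name, demand) pair on the node with
    the most free capacity among those that can satisfy the demand.

    Tasks that no node can satisfy are mapped to None.
    """
    placement = {}
    # Assumption: handle the most demanding tasks first.
    for task_name, demand in sorted(tasks, key=lambda t: t[1], reverse=True):
        candidates = [n for n in nodes if n.capacity >= demand]
        if not candidates:
            placement[task_name] = None
            continue
        best = max(candidates, key=lambda n: n.capacity)
        best.capacity -= demand  # reserve the capacity used by this task
        best.assigned.append(task_name)
        placement[task_name] = best.name
    return placement
```

Calling `schedule([("t1", 6), ("t2", 3)], [Node("A", 10), Node("B", 4)])` returns a dictionary from task name to the chosen node name. A real scheduler would also need to release capacity when tasks finish and account for data locality across regions.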
Pages: 17