JHTD: An Efficient Joint Scheduling Framework Based on Hypergraph for Task Placement and Data Transfer Across Geographically Distributed Data Centers

Authors
Jing, Chao [1 ,2 ,3 ]
Dan, Penggao [1 ,2 ]
Affiliations
[1] Guilin Univ Technol, Sch Informat Sci & Engn, Guilin, Peoples R China
[2] Guilin Univ Technol, Guangxi Key Lab Embedded Technol & Intelligent Sy, Guilin 541004, Peoples R China
[3] Guilin Univ Elect & Technol, Guangxi Key Lab Trusted Software, Guilin 541004, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Big data processing; geographically distributed data centers; joint scheduling framework; hypergraph; task placement; data transferring; NETWORKS; AWARE; JOBS;
DOI
10.1109/ACCESS.2022.3219873
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
With the explosive growth of data volume, data centers play a critical role in storing and processing huge amounts of data. A traditional single data center can no longer keep up with such rapidly growing data. Recently, some studies have extended tasks such as data processing to geographically distributed data centers. However, because task placement and data transfer must be considered jointly, it is difficult to design a scheduling approach that minimizes makespan under constraints such as task dependencies, processing capability, and the network. Therefore, this work proposes JHTD, an efficient joint scheduling framework based on hypergraphs for task placement and data transfer across geographically distributed data centers. JHTD has two stages. In the first stage, because hypergraphs are well suited to modeling complex problems, a hypergraph-based model establishes the relationships among tasks, data files, and data centers, and a hypergraph-based partition method is then applied for task placement. In the second stage, a task reallocation scheme is devised based on each task-to-data dependency, and a data-dependency-aware transfer scheme is designed to minimize the makespan. Finally, a real-world model from the China-VO project is used to conduct a variety of simulation experiments. The results demonstrate that JHTD effectively addresses task placement and data transfer across geographically distributed data centers: compared with three other state-of-the-art algorithms, JHTD reduces the makespan by up to 20.6%. The impacts of data transfer volume and load balancing are also examined to discuss the effectiveness of JHTD.
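The abstract does not give the model in detail, so the following is only a minimal sketch of how a hypergraph can relate tasks and data files, under stated assumptions: each data file is treated as a hyperedge joining the tasks that read it, and a placement is scored by the common "connectivity minus one" cut metric weighted by file size. The toy instance, the names `hyperedges`, `file_size`, `connectivity_cost`, and `greedy_place`, and the greedy heuristic are all hypothetical illustrations, not the partition method used in JHTD.

```python
from collections import defaultdict

# Hypothetical toy instance: each data file is a hyperedge over the tasks that read it.
hyperedges = {
    "f1": {"t1", "t2"},
    "f2": {"t2", "t3", "t4"},
    "f3": {"t4", "t5"},
}
file_size = {"f1": 4, "f2": 10, "f3": 2}  # arbitrary units, assumed for illustration


def connectivity_cost(placement, hyperedges, file_size):
    """Sum over files of size * (number of data centers using the file - 1)."""
    cost = 0
    for f, tasks in hyperedges.items():
        centers = {placement[t] for t in tasks}
        # Files with no placed task contribute nothing.
        cost += file_size[f] * max(len(centers) - 1, 0)
    return cost


def greedy_place(hyperedges, file_size, centers, capacity):
    """Place tasks in sorted order on the non-full center that adds the least cost.

    Assumes len(centers) * capacity is at least the number of tasks.
    This is a simple heuristic and is not guaranteed to find the best placement.
    """
    tasks = sorted({t for ts in hyperedges.values() for t in ts})
    placement, load = {}, defaultdict(int)
    for t in tasks:
        best = None
        for c in centers:
            if load[c] >= capacity:
                continue
            trial = dict(placement)
            trial[t] = c
            # Score only the tasks placed so far.
            partial = {f: ts & set(trial) for f, ts in hyperedges.items()}
            cost = connectivity_cost(trial, partial, file_size)
            if best is None or cost < best[0]:
                best = (cost, c)
        placement[t] = best[1]
        load[best[1]] += 1
    return placement


placement = greedy_place(hyperedges, file_size, ["dc1", "dc2"], capacity=3)
print(placement, connectivity_cost(placement, hyperedges, file_size))
```

In this sketch a lower cost means fewer and smaller files are shared across data centers, which serves as a rough proxy for data transfer volume. A two-stage scheme like the one described above would then refine such an initial placement.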
Pages: 116302-116316
Page count: 15