Planning for distributed workflows: constraint-based coscheduling of computational jobs and data placement in distributed environments

Authors
Makatun, Dzmitry [1, 3]
Lauret, Jerome [2 ]
Rudova, Hana [4 ]
Sumbera, Michal [3 ]
Affiliations
[1] Czech Tech Univ, Fac Nucl Phys & Phys Engn, CR-16635 Prague, Czech Republic
[2] Brookhaven Natl Lab, STAR, Upton, NY 11973 USA
[3] Acad Sci Czech Republ, Nucl Phys Inst, Prague, Czech Republic
[4] Masaryk Univ, CS-60177 Brno, Czech Republic
DOI
10.1088/1742-6596/608/1/012028A
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
When data-intensive applications run on distributed computational resources, access to remotely stored data can cause long I/O overheads. Latency and bandwidth can become the major limiting factors for overall computation performance and can reduce the CPU/WallTime ratio through excessive I/O wait. Building on our previous research, we propose a constraint-programming-based planner that schedules computational jobs and data placements (transfers) in a distributed environment in order to optimize resource utilization and reduce the overall processing completion time. The optimization is achieved by ensuring that none of the resources (network links, data storage and CPUs) is oversaturated at any moment and that either (a) the data is pre-placed at the site where the job runs or (b) the job is scheduled where the data is already present. Such an approach eliminates the idle CPU cycles that occur when a job waits for I/O from a remote site, and it would have wide application in the community. Our planner was evaluated in simulation using data extracted from log files of the batch and data management systems of the STAR experiment. The results of the evaluation and the estimated performance improvements are discussed in this paper.
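The coscheduling idea in the abstract can be illustrated with a toy example. The sketch below is not the authors' constraint programming model: it is a brute-force stand-in that enumerates every job-to-site assignment and keeps the one with the shortest completion time. All names (`plan`, `link_time`, the sites `A` and `B`) are hypothetical, and the model assumes one CPU per site, one transfer at a time per link, and a fixed job order.

```python
from itertools import product


def plan(jobs, sites, link_time):
    """Return (makespan, schedule) for the best job-to-site assignment.

    jobs:      list of (name, data_site, run_time)
    sites:     list of site names, each assumed to have a single CPU
    link_time: {(src, dst): time to move one job's input from src to dst}
    """
    best = None
    for assignment in product(sites, repeat=len(jobs)):
        cpu_free = {s: 0 for s in sites}   # when each site's CPU becomes idle
        link_free = {}                     # when each link becomes idle
        schedule = []
        for (name, data_site, run_time), site in zip(jobs, assignment):
            if site == data_site:
                ready = 0                  # case (b): job runs where the data is
            else:
                link = (data_site, site)   # case (a): pre-place the data first
                ready = link_free.get(link, 0) + link_time[link]
                link_free[link] = ready    # transfers on a link are serialized
            start = max(ready, cpu_free[site])  # CPU never waits on remote I/O
            cpu_free[site] = start + run_time
            schedule.append((name, site, start, start + run_time))
        makespan = max(end for *_, end in schedule)
        if best is None or makespan < best[0]:
            best = (makespan, schedule)
    return best


if __name__ == "__main__":
    jobs = [("j1", "A", 4), ("j2", "A", 4), ("j3", "A", 4)]
    links = {("A", "B"): 1, ("B", "A"): 1}
    makespan, schedule = plan(jobs, ["A", "B"], links)
    print(makespan)
    for row in schedule:
        print(row)
```

In this toy input all data sits at site `A`, so running every job locally takes 12 time units, while moving one input to `B` lets two CPUs work in parallel. Exhaustive enumeration only works for a handful of jobs; a constraint solver is the kind of tool that makes realistic problem sizes tractable.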
Pages: 6