Planning for distributed workflows: constraint-based coscheduling of computational jobs and data placement in distributed environments

Authors
Makatun, Dzmitry [1, 3]
Lauret, Jerome [2 ]
Rudova, Hana [4 ]
Sumbera, Michal [3 ]
Affiliations
[1] Czech Tech Univ, Fac Nucl Phys & Phys Engn, CR-16635 Prague, Czech Republic
[2] Brookhaven Natl Lab, STAR, Upton, NY 11973 USA
[3] Acad Sci Czech Republ, Nucl Phys Inst, Prague, Czech Republic
[4] Masaryk Univ, CS-60177 Brno, Czech Republic
DOI
10.1088/1742-6596/608/1/012028
Chinese Library Classification
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
When data-intensive applications run on distributed computational resources, long I/O overheads may be observed as remotely stored data is accessed. Latency and bandwidth can become the major limiting factors for overall computation performance and can reduce the CPU/WallTime ratio through excessive I/O wait. Building on our previous research, we propose a constraint-programming-based planner that schedules computational jobs and data placements (transfers) in a distributed environment in order to optimize resource utilization and reduce the overall processing completion time. The optimization is achieved by ensuring that none of the resources (network links, data storage and CPUs) is oversaturated at any moment in time, and that either (a) the data is pre-placed at the site where the job runs or (b) the job is scheduled where the data is already present. Such an approach eliminates the idle CPU cycles that occur while a job waits for I/O from a remote site and would have wide application in the community. The planner was evaluated in simulations based on data extracted from log files of the batch and data management systems of the STAR experiment. The results of the evaluation and the estimated performance improvements are discussed in this paper.
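The trade-off described in the abstract (move the data to a free CPU, or wait for a CPU where the data already is) can be illustrated with a small sketch. This is not the authors' constraint-programming planner: it is a greedy toy that only mimics the stated constraints, with each CPU slot running one job at a time and each link carrying one transfer at a time so that no resource is oversaturated. The names `Job`, `plan`, `cpus` and `link_gb_per_hour`, as well as all numbers, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    data_site: str    # site where the input file currently resides
    size_gb: float    # input file size
    cpu_hours: float  # pure compute time once the data is local


def plan(jobs, cpus, link_gb_per_hour):
    """Greedy toy planner (an illustration, not constraint programming).

    cpus: {site: number of CPU slots}
    link_gb_per_hour: {(src, dst): bandwidth}; one transfer at a time per link
    Returns {job name: (site, start, finish)} with times in hours.
    """
    cpu_free = {s: [0.0] * n for s, n in cpus.items()}  # next free time per slot
    link_free = {link: 0.0 for link in link_gb_per_hour}  # next free time per link
    schedule = {}
    for job in jobs:
        best = None
        for site, slots in cpu_free.items():
            if not slots:
                continue  # site has no CPU slots
            if site == job.data_site:
                # Option (b): run where the data already is.
                data_ready, link = 0.0, None
            else:
                # Option (a): pre-place the data, queued behind earlier transfers.
                link = (job.data_site, site)
                if link not in link_gb_per_hour:
                    continue
                data_ready = link_free[link] + job.size_gb / link_gb_per_hour[link]
            slot = min(range(len(slots)), key=slots.__getitem__)
            start = max(slots[slot], data_ready)  # never compute before data is local
            finish = start + job.cpu_hours
            if best is None or finish < best[0]:
                best = (finish, start, site, slot, link, data_ready)
        if best is None:
            raise ValueError(f"no feasible site for job {job.name}")
        finish, start, site, slot, link, data_ready = best
        cpu_free[site][slot] = finish
        if link is not None:
            link_free[link] = data_ready  # link is busy until the transfer ends
        schedule[job.name] = (site, start, finish)
    return schedule


# Hypothetical input: three jobs whose data all sits at site A.
jobs = [Job(f"j{i}", "A", size_gb=10.0, cpu_hours=4.0) for i in (1, 2, 3)]
schedule = plan(jobs, cpus={"A": 1, "B": 1}, link_gb_per_hour={("A", "B"): 10.0})
makespan = max(finish for _, _, finish in schedule.values())
```

With these made-up inputs the sketch places `j2` at site B after a one-hour transfer and finishes all work at hour 8, whereas running the three jobs one after another at site A would take 12 hours. A constraint-based planner would search over such assignments globally rather than committing job by job.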
Pages: 6