PISCES: Optimizing Multi-Job Application Execution in MapReduce

Cited by: 4
Authors
Chen, Qi [1]
Yao, Jinyu [1]
Li, Benchao [1]
Xiao, Zhen [1]
Affiliations
[1] Peking Univ, Dept Comp Sci, Beijing 100871, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
MapReduce; job dependency; group scheduling; pipeline; optimization
DOI
10.1109/TCC.2016.2603509
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Many MapReduce applications consist of groups of interdependent jobs, such as iterative machine learning applications and large database queries. Unfortunately, the MapReduce framework is not optimized for these multi-job applications: it does not exploit opportunities to overlap execution among jobs and can only schedule jobs independently. These issues significantly inflate application execution time. This paper presents Pipeline Improvement Support with Critical chain Estimation Scheduling (PISCES), a critical chain optimization that provides better support for multi-job applications. A critical chain is a series of jobs in which a delay to any one job makes the whole application run longer. PISCES extends the existing MapReduce framework to schedule multiple jobs with dependencies by dynamically building a job dependency DAG for the currently running jobs according to their input and output directories. Using this DAG, it provides a mechanism for pipelining data between the output phase of an upstream job (the map phase in a Map-Only job, or the reduce phase in a Map-Reduce job) and the map phase of a downstream job. This introduces a new form of execution overlap between dependent jobs in MapReduce, which effectively reduces application runtime. Moreover, PISCES proposes a critical chain job scheduling model based on accurate critical chain estimation. Experiments show that PISCES can increase the degree of system parallelism by up to 68 percent and improve the execution speed of applications by up to 52 percent.
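The two ideas named in the abstract, deriving a dependency DAG from input and output directories and finding the critical chain, can be illustrated with a minimal Python sketch. This is not the PISCES implementation. The job names, directory paths, and per-job duration estimates are hypothetical, directories are matched by exact string equality, and the critical chain is approximated as the longest duration-weighted path through the DAG.

```python
from collections import defaultdict


def build_dependency_dag(jobs):
    """Map each job to the set of jobs that read one of its output directories.

    `jobs` maps a job name to {"inputs": set_of_dirs, "outputs": set_of_dirs}.
    Assumption: a dependency exists when directory strings match exactly.
    """
    producers = defaultdict(set)
    for name, spec in jobs.items():
        for directory in spec["outputs"]:
            producers[directory].add(name)

    dag = {name: set() for name in jobs}
    for name, spec in jobs.items():
        for directory in spec["inputs"]:
            for upstream in producers.get(directory, ()):
                if upstream != name:
                    dag[upstream].add(name)
    return dag


def critical_chain(dag, durations):
    """Return (chain, total) for the longest duration-weighted path in the DAG."""
    indegree = {name: 0 for name in dag}
    for upstream in dag:
        for downstream in dag[upstream]:
            indegree[downstream] += 1

    # finish[j] is the earliest time j can finish if every job starts
    # as soon as all of its upstream jobs have finished.
    finish = {name: durations[name] for name in dag}
    previous = {name: None for name in dag}
    ready = [name for name in dag if indegree[name] == 0]
    visited = 0
    while ready:  # Kahn's topological traversal
        upstream = ready.pop()
        visited += 1
        for downstream in dag[upstream]:
            candidate = finish[upstream] + durations[downstream]
            if candidate > finish[downstream]:
                finish[downstream] = candidate
                previous[downstream] = upstream
            indegree[downstream] -= 1
            if indegree[downstream] == 0:
                ready.append(downstream)
    if visited != len(dag):
        raise ValueError("job dependencies contain a cycle")

    last = max(finish, key=finish.get)
    total = finish[last]
    chain = []
    while last is not None:
        chain.append(last)
        last = previous[last]
    return chain[::-1], total


# Hypothetical four-job application and duration estimates (arbitrary units).
jobs = {
    "extract": {"inputs": {"/raw"}, "outputs": {"/stage/a"}},
    "clean": {"inputs": {"/raw"}, "outputs": {"/stage/b"}},
    "join": {"inputs": {"/stage/a", "/stage/b"}, "outputs": {"/stage/c"}},
    "report": {"inputs": {"/stage/c"}, "outputs": {"/out"}},
}
durations = {"extract": 4, "clean": 7, "join": 5, "report": 2}

dag = build_dependency_dag(jobs)
chain, total = critical_chain(dag, durations)
print(chain, total)
```

In this example `extract` and `clean` are independent, so delaying `extract` by up to three units does not lengthen the application, whereas any delay to `clean`, `join`, or `report` does. A scheduler that favors jobs on that chain is the intuition behind critical chain scheduling as the abstract describes it.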
Pages: 273-286
Number of pages: 14
Related papers
50 in total
  • [31] Design of Multi-job Controlling Mechanism in Customer Order Planning and Scheduling
    Zhang, Xiang
    Wang, Wei
    Ye, Chen
    Wang, Guoxin
    2009 IEEE INTERNATIONAL CONFERENCE ON INDUSTRIAL ENGINEERING AND ENGINEERING MANAGEMENT, VOLS 1-4, 2009, : 1714 - +
  • [32] Planning and Monitoring Multi-Job Type Swarm Search and Service Missions
    Chandarana, Meghan
    Hughes, Dana
    Lewis, Michael
    Sycara, Katia
    Scherer, Sebastian
    JOURNAL OF INTELLIGENT & ROBOTIC SYSTEMS, 2021, 101 (03)
  • [33] Multi-Job Intelligent Scheduling With Cross-Device Federated Learning
    Liu, Ji
    Jia, Juncheng
    Ma, Beichen
    Zhou, Chendi
    Zhou, Jingbo
    Zhou, Yang
    Dai, Huaiyu
    Dou, Dejing
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2023, 34 (02) : 535 - 551
  • [34] Planning and Monitoring Multi-Job Type Swarm Search and Service Missions
    Meghan Chandarana
    Dana Hughes
    Michael Lewis
    Katia Sycara
    Sebastian Scherer
    Journal of Intelligent & Robotic Systems, 2021, 101
  • [35] Impact of MapReduce Task Re-execution Policy on Job Completion Reliability and Job Completion Time
    Lin, Jia-Chun
    Leu, Fang-Yie
    Chen, Ying-ping
    Munawar, Waqaas
    2014 IEEE 28TH INTERNATIONAL CONFERENCE ON ADVANCED INFORMATION NETWORKING AND APPLICATIONS (AINA), 2014, : 712 - 718
  • [36] A Multi-Input File Data Symmetry Placement Method Considering Job Execution Frequency for MapReduce Join Operation
    Wu, Jia-Xuan
    Zhang, Yu-Zhu
    Jiang, Yue-Qiu
    Zhang, Xin
    INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2022, 36 (15)
  • [37] Optimizing Power and Performance Trade-offs of MapReduce Job Processing with Heterogeneous Multi-Core Processors
    Yan, Feng
    Cherkasova, Ludmila
    Zhang, Zhuoyao
    Smirni, Evgenia
    2014 IEEE 7TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING (CLOUD), 2014, : 240 - 247
  • [38] Sampling-Based Multi-Job Placement for Heterogeneous Deep Learning Clusters
    Liu, Kaiyang
    Wang, Jingrong
    Huang, Zhiming
    Pan, Jianping
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2024, 35 (06) : 874 - 888
  • [39] IMI: In-memory Multi-job Inference Acceleration for Large Language Models
    Gao, Bin
    Wang, Zhehui
    He, Zhuomin
    Luo, Tao
    Wong, Weng-Fai
    Zhou, Zhi
    53RD INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING, ICPP 2024, 2024, : 752 - 761
  • [40] Optimizing schedules in the pipeline systems with variable job execution order
    Levin, VI
    AUTOMATION AND REMOTE CONTROL, 2005, 66 (03) : 406 - 421