Intelligent Scheduling for Parallel Jobs in Big Data Processing Systems

Cited by: 0
Authors
Xu, Mingrui [1]
Wu, Chase Q. [1, 2]
Hou, Aiqin [1]
Wang, Yongqiang [1]
Affiliations
[1] Northwest Univ, Sch Informat Sci & Technol, Xian 710127, Shaanxi, Peoples R China
[2] New Jersey Inst Technol, Dept Comp Sci, Newark, NJ 07102 USA
Funding
US National Science Foundation;
Keywords
Task scheduling; heterogeneous clusters; big data platform; cluster manager;
DOI
10.1109/iccnc.2019.8685520
Chinese Library Classification
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
The explosive growth of data in various scientific, industrial, and business domains necessitates the use of big data processing systems, such as Hadoop, which are typically deployed in a physical or cloud-based cluster shared by many users running parallel jobs. As the user population and application scale increase, such systems are expanded from time to time with new nodes of different types, making the cluster highly heterogeneous. Job scheduling in such systems largely determines the performance of big data applications and remains a challenging problem. In this paper, we formulate a generic job scheduling problem for parallel processing of big data in heterogeneous clusters and design a k-means based task scheduling algorithm, referred to as KMTS. Simulation results show that KMTS improves execution performance by 25% and 30% on average in single job scheduling and parallel job scheduling, respectively, over existing methods. The performance superiority is also confirmed by real experiments in high-performance computing environments.
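The abstract names a k-means based approach but does not describe how KMTS works, so the following is only a rough illustration of the general idea, not the authors' algorithm. It assumes, hypothetically, that each task has a scalar size and each node a scalar speed. It groups both with a plain one-dimensional k-means, pairs the group of largest tasks with the group of fastest nodes, and places each task on the node in its paired group that would finish it earliest. The names `kmeans_1d` and `schedule` are invented for this sketch.

```python
from typing import Dict, List, Tuple


def kmeans_1d(values: List[float], k: int, iters: int = 50) -> List[int]:
    """Plain Lloyd's k-means on scalars; returns one cluster label per value.

    Labels are ranked so that label 0 has the smallest centroid.
    """
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and len(values)")
    ordered = sorted(values)
    # Deterministic initialization: spread centroids over the sorted values.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        new_labels = [
            min(range(k), key=lambda c: abs(v - centroids[c])) for v in values
        ]
        for c in range(k):
            members = [v for v, lab in zip(values, new_labels) if lab == c]
            if members:  # an emptied cluster keeps its previous centroid
                centroids[c] = sum(members) / len(members)
        if new_labels == labels:
            break
        labels = new_labels
    order = sorted(range(k), key=lambda c: centroids[c])
    rank = {c: r for r, c in enumerate(order)}
    return [rank[lab] for lab in labels]


def schedule(
    task_sizes: List[float], node_speeds: List[float], k: int
) -> Tuple[Dict[int, int], float]:
    """Map task index -> node index and return the resulting makespan.

    Assumption for illustration: run time = task size / node speed.
    """
    task_group = kmeans_1d(task_sizes, k)
    node_group = kmeans_1d(node_speeds, k)
    nodes_in = {
        g: [n for n, ng in enumerate(node_group) if ng == g] for g in range(k)
    }
    finish = [0.0] * len(node_speeds)
    placement: Dict[int, int] = {}
    # Largest tasks first, each restricted to the node group of the same rank.
    for t in sorted(range(len(task_sizes)), key=lambda i: -task_sizes[i]):
        candidates = nodes_in[task_group[t]] or list(range(len(node_speeds)))
        best = min(
            candidates, key=lambda n: finish[n] + task_sizes[t] / node_speeds[n]
        )
        finish[best] += task_sizes[t] / node_speeds[best]
        placement[t] = best
    return placement, max(finish)


if __name__ == "__main__":
    plan, makespan = schedule([100, 90, 10, 12], [1.0, 1.1, 4.0, 4.2], k=2)
    print(plan, round(makespan, 2))
```

In this toy input the two large tasks land on the two fast nodes and the two small tasks on the slow ones. A real scheduler would also have to account for data locality, memory, and multi-dimensional node profiles, none of which this sketch models.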
Pages: 22-28
Page count: 7