CATS: cache-aware task scheduling for Hadoop-based systems

Cited by: 8
Authors
Lim, Byungnam [1]
Kim, Jong Wook [3]
Chung, Yon Dohn [2]
Affiliations
[1] Korea Univ, Database Studies, Seoul, South Korea
[2] Korea Univ, Dept Comp Sci & Engn, Seoul, South Korea
[3] Sangmyung Univ, Comp Sci, Seoul, South Korea
Funding
National Research Foundation of Singapore;
Keywords
Task scheduling; Distributed systems; Hadoop; In-memory;
DOI
10.1007/s10586-017-0920-6
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
With the explosion of big data, data-intensive cluster computing systems have driven a shift to a new data processing paradigm. Because Hadoop, one of the most widely used data processing frameworks, achieves high performance by running multiple tasks in parallel across the nodes of a large cluster, task scheduling is considered one of the most important factors affecting overall performance. Modern operating systems use caching to improve local disk access times, serving data from main memory without disk accesses. Existing task scheduling methods for Hadoop-based systems, however, make poor use of this option, mainly because cached data cannot be tracked in shared-nothing distributed environments. In this paper, we propose cache-aware task scheduling (CATS), a scheduling method for Hadoop-based systems that exploits the operating system's buffer cache and assigns tasks to nodes in consideration of the cached data. Through comprehensive experiments, we show that the proposed cache-aware scheduling improves the overall job execution time for various workload types and data sizes.
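The abstract does not spell out the scheduling algorithm, so the following is only a minimal illustrative sketch of the general idea of assigning tasks "in consideration of the cached data"; it is not the authors' CATS implementation. It assumes the scheduler already knows, per node, which input blocks sit in the buffer cache and which are stored on local disk. The names `pick_node`, `schedule`, `cached`, `replicas`, and `free_slots` are hypothetical and appear nowhere in the record.

```python
from typing import Dict, List, Optional, Set, Tuple


def pick_node(block: str,
              free_slots: Dict[str, int],
              cached: Dict[str, Set[str]],
              replicas: Dict[str, Set[str]]) -> Optional[str]:
    """Pick a node with a free slot for a task that reads `block`.

    Preference order (an assumption, not taken from the paper):
    1. nodes holding the block in their buffer cache,
    2. nodes holding the block on local disk,
    3. any other node with a free slot.
    """
    candidates = [n for n, slots in free_slots.items() if slots > 0]
    if not candidates:
        return None  # no capacity left anywhere

    def rank(node: str):
        in_cache = block in cached.get(node, set())
        on_disk = block in replicas.get(node, set())
        # Smaller tuples win: cache hit first, then disk-local,
        # then the node with more free slots, then name for determinism.
        return (not in_cache, not on_disk, -free_slots[node], node)

    return min(candidates, key=rank)


def schedule(tasks: List[Tuple[str, str]],
             free_slots: Dict[str, int],
             cached: Dict[str, Set[str]],
             replicas: Dict[str, Set[str]]) -> Dict[str, Optional[str]]:
    """Greedily assign (task_id, block) pairs to nodes, one slot per task."""
    slots = dict(free_slots)  # copy so the caller's view is not mutated
    assignment: Dict[str, Optional[str]] = {}
    for task_id, block in tasks:
        node = pick_node(block, slots, cached, replicas)
        assignment[task_id] = node
        if node is not None:
            slots[node] -= 1
    return assignment
```

In this sketch a task whose block is cached on a node with a free slot goes to that node, and tasks with no cached copy fall back to ordinary disk locality. How a real system learns the contents of each node's buffer cache is exactly the tracking problem the abstract names, and it is left out here.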
Pages: 3691-3705
Page count: 15
Related papers
50 in total
  • [1] CATS: cache-aware task scheduling for Hadoop-based systems
    Lim, Byungnam
    Kim, Jong Wook
    Chung, Yon Dohn
    [J]. Cluster Computing, 2017, 20 : 3691 - 3705
  • [2] Evaluating Task Scheduling in Hadoop-based Cloud Systems
    Liu, Shengyuan
    Xu, Jungang
    Liu, Zongzhen
    Liu, Xu
    [J]. 2013 IEEE INTERNATIONAL CONFERENCE ON BIG DATA, 2013,
  • [3] Cache-Aware Task Scheduling for Maximizing Control Performance
    Chang, Wanli
    Roy, Debayan
    Hu, Xiaobo Sharon
    Chakraborty, Samarjit
    [J]. PROCEEDINGS OF THE 2018 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION (DATE), 2018, : 694 - 699
  • [4] A cache-aware scheduling algorithm for embedded systems
    Luculli, G
    Di Natale, M
    [J]. 18TH IEEE REAL-TIME SYSTEMS SYMPOSIUM, PROCEEDINGS, 1997, : 199 - 209
  • [5] A shared cache-aware Task scheduling strategy for multi-core systems
    Tang, Xiaoyong
    Yang, Xiaopan
    Liao, Guiping
    Zhu, Xinghui
    [J]. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2016, 31 (02) : 1079 - 1088
  • [6] Cache-Aware Task Scheduling on Multi-Core Architecture
    Yang, Teng-Feng
    Lin, Chung-Hsiang
    Yang, Chia-Lin
    [J]. 2010 INTERNATIONAL SYMPOSIUM ON VLSI DESIGN AUTOMATION AND TEST (VLSI-DAT), 2010, : 139 - 142
  • [7] Disk Cache-Aware Task Scheduling For Data-Intensive and Many-Task Workflow
    Tanaka, Masahiro
    Tatebe, Osamu
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING (CLUSTER), 2014, : 167 - 175
  • [8] Shared Cache-aware Scheduling Algorithm on Multi-core Systems
    Tang, Xiao-Yong
    [J]. INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND COMMUNICATION ENGINEERING (CSCE 2015), 2015, : 1249 - 1255
  • [9] Energy-Efficient Cache-Aware Scheduling on Heterogeneous Multicore Systems
    Sheikh, Saad Zia
    Pasha, Muhammad Adeel
    [J]. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2022, 33 (01) : 206 - 217
  • [10] Cache-Aware Dynamic Classification and Scheduling for Linux
    Gollapudi, Ravi Theja
    Yuksek, Gokturk
    Ghose, Kanad
    [J]. 2019 IEEE SYMPOSIUM IN LOW-POWER AND HIGH-SPEED CHIPS (COOL CHIPS 22), 2019,