Task Scheduling on Manycore Processors with Home Caches

Authors
Muddukrishna, Ananya [1]
Podobas, Artur [1]
Brorsson, Mats [1]
Vlassov, Vladimir [1]
Affiliations
[1] KTH Royal Inst Technol, Stockholm, Sweden
DOI
Not available
Chinese Library Classification number
TP31 [Computer Software]
Subject classification code
081202; 0835
Abstract
Modern manycore processors feature a highly scalable, software-configurable cache hierarchy. To get good performance, manycore programmers must not only use the large number of cores efficiently but also understand and configure the cache hierarchy to suit the application. Task-based programming models can relieve this burden: programmers express parallelism as tasks, and an architecture-specific runtime system maps tasks to cores and configures the cache hierarchy. In this paper, we focus on the cache hierarchy of the Tilera TILEPro64 processor, which features a software-configurable coherence waypoint called the home cache. We first show the performance bottleneck that arises when the runtime system schedules tasks without regard to home caches. We then demonstrate a technique in which the runtime system controls the assignment of home caches to memory blocks and schedules tasks to minimize home cache access penalties. Tests of our technique show a significant reduction in execution time on selected benchmarks. We conclude that, by taking processor architecture features into account, task-based programming models can sustain performance and let programmers transition smoothly from the multicore to the manycore era.
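The idea of scheduling tasks to reduce home cache access penalties can be illustrated with a small sketch. This is not the paper's algorithm, which the abstract does not spell out. It assumes that cores sit on an 8x8 grid, that the penalty of reaching a block's home cache grows with the hop distance between the running core and the home core, and that each task declares the blocks it touches. All names (`MESH_WIDTH`, `home_of`, `pick_core`) are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

MESH_WIDTH = 8  # assumption: 64 cores arranged as an 8x8 grid


def core_xy(core: int) -> Tuple[int, int]:
    """Map a linear core id to (x, y) grid coordinates."""
    return core % MESH_WIDTH, core // MESH_WIDTH


def hops(a: int, b: int) -> int:
    """Manhattan distance between two cores, used as a stand-in for access penalty."""
    ax, ay = core_xy(a)
    bx, by = core_xy(b)
    return abs(ax - bx) + abs(ay - by)


def access_penalty(core: int, blocks: Iterable[str], home_of: Dict[str, int]) -> int:
    """Total hop distance from `core` to the home cores of all blocks a task touches."""
    return sum(hops(core, home_of[b]) for b in blocks)


def pick_core(idle_cores: List[int], blocks: List[str], home_of: Dict[str, int]) -> int:
    """Choose the idle core with the lowest penalty; ties go to the lowest core id."""
    return min(idle_cores, key=lambda c: (access_penalty(c, blocks, home_of), c))


# Hypothetical home assignment: blocks A and B are homed near core 0, C at core 63.
home_of = {"A": 0, "B": 1, "C": 63}
idle = [0, 9, 62]

print(pick_core(idle, ["A", "B"], home_of))  # task touching A and B
print(pick_core(idle, ["C"], home_of))       # task touching C
```

A home-oblivious scheduler would pick any idle core; here the task that touches A and B lands on the core closest to their homes, and the task that touches C lands next to core 63. A real runtime would also weigh load balance and how often each block is accessed, which this sketch ignores.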
Pages: 357-367
Page count: 11