TaskStream: Accelerating Task-Parallel Workloads by Recovering Program Structure

Cited by: 10
Authors
Dadu, Vidushi [1]
Nowatzki, Tony [1]
Affiliations
[1] Univ Calif Los Angeles, Los Angeles, CA 90095 USA
Keywords
irregularity; tasks; load-balance; accelerators; generality; dataflow; reconfigurable; streaming; locality
DOI
10.1145/3503222.3507706
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Reconfigurable accelerators, like CGRAs and dataflow architectures, have come to prominence for addressing data-processing problems. However, they are largely limited to workloads with regular parallelism, precluding their applicability to prevalent task-parallel workloads. Reconfigurable architectures and task parallelism seem to be at odds, as the former requires repetitive and simple program structure, and the latter breaks program structure to create small, individually scheduled program units. Our insight is that if tasks and their potential for communication structure are first-class primitives in the hardware, it is possible to recover program structure with extremely low overhead. We propose a task execution model for accelerators called TaskStream, which annotates task dependences with information sufficient to recover inter-task structure. TaskStream enables work-aware load balancing, recovery of pipelined inter-task dependences, and recovery of inter-task read sharing through multicasting. We apply TaskStream to a reconfigurable dataflow architecture, creating a seamless hierarchical dataflow model for task-parallel workloads. We compare our accelerator, Delta, with an equivalent static-parallel design. Overall, we find that our execution model can improve performance by 2.2x with only 3.6% area overhead, while alleviating the programming burden of managing task distribution.
Pages: 1-13
Number of pages: 13
Related papers
50 in total
  • [21] Locality-Aware Task-Parallel Execution on GPUs
    Hbeika, Jad
    Kulkarni, Milind
    [J]. LANGUAGES AND COMPILERS FOR PARALLEL COMPUTING, LCPC 2016, 2017, 10136 : 250 - 264
  • [22] Unordered Task-Parallel Augmented Merge Tree Construction
    Werner, Kilian
    Garth, Christoph
    [J]. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, 2021, 27 (08) : 3585 - 3596
  • [23] Atos: A Task-Parallel GPU Scheduler for Graph Analytics
    Chen, Yuxin
    Brock, Benjamin
    Porumbescu, Serban
    Buluc, Aydin
    Yelick, Katherine
    Owens, John D.
    [J]. 51ST INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING, ICPP 2022, 2022
  • [24] Design of a Task-Parallel Version of ILUPACK for Graphics Processors
    Aliaga, Jose I.
    Dufrechou, Ernesto
    Ezzatti, Pablo
    Quintana-Orti, Enrique S.
    [J]. HIGH PERFORMANCE COMPUTING CARLA 2016, 2017, 697 : 91 - 103
  • [25] An Elasticity Description Language for Task-parallel Cloud Applications
    Haussmann, Jens
    Blochinger, Wolfgang
    Kuechlin, Wolfgang
    [J]. PROCEEDINGS OF THE 10TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND SERVICES SCIENCE (CLOSER), 2020, : 473 - 481
  • [26] Scalable Task-Parallel SGD on Matrix Factorization in Multicore Architectures
    Nishioka, Yusuke
    Taura, Kenjiro
    [J]. 2015 IEEE 29TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS, 2015, : 1178 - 1184
  • [27] Extracting SIMD Parallelism from Recursive Task-Parallel Programs
    Ren, Bin
    Balakrishna, Shruthi
    Jo, Youngjoon
    Krishnamoorthy, Sriram
    Agrawal, Kunal
    Kulkarni, Milind
    [J]. ACM TRANSACTIONS ON PARALLEL COMPUTING, 2019, 6 (04)
  • [28] Visualization aided performance tuning of irregular task-parallel computations
    Blochinger, Wolfgang
    Kaufmann, Michael
    Siebenhaller, Martin
    [J]. Information Visualization, 2006, 5 (02) : 81 - 94
  • [29] Extending High-Level Synthesis for Task-Parallel Programs
    Chi, Yuze
    Guo, Licheng
    Lau, Jason
    Choi, Young-kyu
    Wang, Jie
    Cong, Jason
    [J]. 2021 IEEE 29TH ANNUAL INTERNATIONAL SYMPOSIUM ON FIELD-PROGRAMMABLE CUSTOM COMPUTING MACHINES (FCCM 2021), 2021, : 204 - 213
  • [30] Task-Parallel LU Factorization of Hierarchical Matrices using OmpSs
    Aliaga, Jose I.
    Carratala-Saez, Rocio
    Quintana-Orti, Enrique S.
    Kriemann, Ronald
    [J]. 2017 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW), 2017, : 1148 - 1157