Implementing Directed Acyclic Graphs with the Heterogeneous System Architecture

Cited by: 9
Authors
Puthoor, Sooraj [1]
Aji, Ashwin M. [1]
Che, Shuai [1]
Daga, Mayank [1]
Wu, Wei [2]
Beckmann, Bradford M. [1]
Rodgers, Gregory [1]
Affiliations
[1] AMD Res, Sunnyvale, CA 94085 USA
[2] Univ Tennessee, Knoxville, TN USA
DOI
10.1145/2884045.2884052
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Achieving optimal performance on heterogeneous computing systems requires a programming model that supports the execution of asynchronous, multi-stream, and out-of-order tasks in a shared memory environment. Asynchronous dependency-driven tasking is one such programming model: it allows the computation to be expressed as a directed acyclic graph (DAG) and exposes fine-grain task management to the programmer. The use of DAGs to extract parallelism also enables runtimes to perform dynamic load-balancing, thereby achieving higher throughput than traditional bulk-synchronous execution. However, efficient DAG implementations require features such as user-level task dispatch, hardware signalling, and local barriers to achieve low-overhead task dispatch and dependency resolution. In this paper, we demonstrate that the Heterogeneous System Architecture (HSA) exposes the above capabilities, and we validate their benefits by implementing three well-referenced applications using fine-grain tasks: Cholesky factorization, Lower Upper Decomposition (LUD), and Needleman-Wunsch (NW). HSA's user-level task dispatch and signalling capability allow work to be launched and dependencies to be managed directly by the hardware, avoiding inefficient bulk-synchronization. Our results show that the HSA task-based implementations of Cholesky, LUD, and NW are representative of this emerging class of workloads and that, using hardware-managed tasks, they achieve speedups of 3.8x, 1.6x, and 1.5x, respectively, compared to bulk-synchronous implementations.
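The dependency-driven tasking idea in the abstract can be illustrated with a minimal software sketch. This is not the paper's HSA implementation: it is a sequential Python model in which a per-task counter stands in for a hardware signal, and the tile grid mimics the wavefront dependencies typical of Needleman-Wunsch. The names `build_wavefront_dag` and `run_dag` are hypothetical and introduced only for illustration.

```python
from collections import deque


def build_wavefront_dag(n):
    """Return {task: [predecessors]} for an n x n tile grid.

    Assumption: tile (i, j) depends on its upper, left, and upper-left
    neighbours, as in a tiled Needleman-Wunsch wavefront.
    """
    deps = {}
    for i in range(n):
        for j in range(n):
            preds = []
            if i > 0:
                preds.append((i - 1, j))
            if j > 0:
                preds.append((i, j - 1))
            if i > 0 and j > 0:
                preds.append((i - 1, j - 1))
            deps[(i, j)] = preds
    return deps


def run_dag(deps, run_task):
    """Run each task once all of its predecessors have completed."""
    # Outstanding-dependency counter per task (software stand-in for a signal).
    remaining = {task: len(preds) for task, preds in deps.items()}
    successors = {task: [] for task in deps}
    for task, preds in deps.items():
        for pred in preds:
            successors[pred].append(task)

    ready = deque(task for task, count in remaining.items() if count == 0)
    order = []
    while ready:
        task = ready.popleft()
        run_task(task)
        order.append(task)
        for succ in successors[task]:
            remaining[succ] -= 1  # completion "signals" each successor
            if remaining[succ] == 0:
                ready.append(succ)  # no global barrier needed

    if len(order) != len(deps):
        raise ValueError("dependency graph contains a cycle")
    return order


if __name__ == "__main__":
    dag = build_wavefront_dag(3)
    executed = run_dag(dag, lambda task: None)
    print(executed)
```

In this sketch a task becomes ready the moment its own counter reaches zero, whereas a bulk-synchronous version would wait for an entire anti-diagonal of tiles to finish before launching the next one.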
Pages: 53-62
Page count: 10