Trireme: Exploration of Hierarchical Multi-level Parallelism for Hardware Acceleration

Cited by: 2

Authors:

Zacharopoulos, Georgios [1]

Ejjeh, Adel [2]

Jing, Ying [2]

Yang, En-Yu

Jia, Tianyu [1]

Brumar, Iulian [1]

Intan, Jeremy [2]

Huzaifa, Muhammad [2]

Adve, Sarita [2]

Adve, Vikram [2]

Wei, Gu-Yeon [1]

Brooks, David [1]

Affiliations:

[1] Harvard Univ, POB 121, Cambridge, MA 43017 USA

[2] Univ Illinois, 201 N Goodwin Ave, Champaign, IL USA

Source:

ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS | 2023 / Vol. 22 / Issue 03

Funding:

Swiss National Science Foundation; U.S. National Science Foundation

Keywords:

Accelerators; ASICs; compiler techniques and optimizations; design tools; heterogeneous systems parallelism;

DOI:

10.1145/3580394

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology]

Discipline classification code:

0812

Abstract:

The design of heterogeneous systems that include domain-specific accelerators is a challenging and time-consuming process. While taking into account area constraints, designers must decide which parts of an application to accelerate in hardware and which to leave in software. Moreover, applications in domains such as Extended Reality (XR) offer opportunities for various forms of parallel execution, including loop-level, task-level, and pipeline parallelism. To assist the design process and expose every possible level of parallelism, we present Trireme, a fully automated tool-chain that explores multiple levels of parallelism and produces domain-specific accelerator designs and configurations that maximize performance, given an area budget. FPGA SoCs were used as target platforms, and Catapult HLS [7] was used to synthesize RTL using a commercial 12 nm FinFET technology. Experiments on demanding benchmarks from the XR domain revealed a speedup of up to 20x, as well as a speedup of up to 37x for smaller applications, compared to software-only implementations.


Pages: 23
