Design Optimization for High-Performance Computing Using FPGA

Cited by: 0

Authors:

Isik, Murat [1]

Inadagbo, Kayode [2]

Aktas, Hakan [3]

Affiliations:

[1] Drexel Univ, Elect & Comp Engn Dept, Philadelphia, PA 19104 USA

[2] A&M Univ, Elect & Comp Engn Dept, Prairie View, TX USA

[3] Omer Halisdemir Univ, Comp Engn Dept, Nigde, Turkiye

Source:

INFORMATION MANAGEMENT AND BIG DATA, SIMBIG 2023 | 2024 / Vol. 2142

Keywords:

High-performance computing; Tensil AI; Design optimization; FPGA; Open-source inference accelerator

DOI:

10.1007/978-3-031-63616-5_11

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Reconfigurable architectures such as Field Programmable Gate Arrays (FPGAs) have been used to accelerate computation in several domains because of their combination of flexibility, performance, and power efficiency. However, FPGAs have not been widely adopted for high-performance computing, primarily because of their programming complexity and the difficulty of optimizing their performance. To gain insight into the use of FPGAs for high-performance computing, we optimize Tensil AI's open-source inference accelerator for maximum performance using ResNet20 trained on CIFAR. We show that improving the hardware design, using Xilinx Ultra RAM, and applying advanced compiler strategies lead to improved inference performance. We also demonstrate that running the CIFAR test data set shows very little accuracy drop when rounding down from the original 32-bit floating point. The heterogeneous computing model in our platform allows us to achieve a frame rate of 293.58 frames per second (FPS) and 90% accuracy on a ResNet20 trained using CIFAR. The experimental results show that the proposed accelerator achieves a throughput of 21.12 Giga-Operations Per Second (GOP/s) with an on-chip power consumption of 5.21 W at 100 MHz. Comparisons with off-the-shelf devices and recent state-of-the-art implementations show that the proposed accelerator has clear advantages in energy efficiency.
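The energy-efficiency claim can be made concrete with a little arithmetic on the figures quoted in the abstract. The following is a minimal sketch, not taken from the paper: the function name `derived_metrics` and its dictionary keys are hypothetical, and the energy-per-frame figure assumes that the reported 5.21 W on-chip power applies to the same run that produced 293.58 FPS, which the abstract does not state explicitly.

```python
def derived_metrics(throughput_gops: float, power_w: float, fps: float) -> dict:
    """Derive efficiency figures from reported throughput, power, and frame rate.

    Hypothetical helper for illustration; the inputs are the only values
    taken from the abstract.
    """
    return {
        # Energy efficiency: operations per second delivered per watt.
        "gops_per_watt": throughput_gops / power_w,
        # Average time spent per frame, in milliseconds.
        "latency_ms": 1000.0 / fps,
        # Energy per frame in millijoules (assumes power and FPS describe
        # the same workload, and counts on-chip power only).
        "energy_per_frame_mj": power_w / fps * 1000.0,
    }


if __name__ == "__main__":
    # Figures quoted in the abstract: 21.12 GOP/s, 5.21 W, 293.58 FPS.
    metrics = derived_metrics(throughput_gops=21.12, power_w=5.21, fps=293.58)
    for name, value in metrics.items():
        print(f"{name}: {value:.2f}")
```

With the abstract's numbers this gives roughly 4.05 GOP/s per watt, about 3.41 ms per frame, and about 17.75 mJ per frame under the stated assumption.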

Pages: 142-156

Number of pages: 15

