Embedded GPU Cluster Computing Framework for Inference of Convolutional Neural Networks

Cited by: 0

Authors:

Kain, Evan [1]

Wildenstein, Diego [2]

Pineda, Andrew C. [3]

Affiliations:

[1] Univ Pittsburgh, Dept Elect & Comp Engn, NSF Ctr Space Highperformance & Resilient Comp SH, Pittsburgh, PA 15260 USA

[2] Arizona State Univ, Sch Elect Comp & Energy Engn, Tempe, AZ USA

[3] US Air Force, Spacecraft Component Technol Branch, Space Vehicles Directorate, Res Lab, Kirtland AFB, NM USA

Source:

2019 IEEE HIGH PERFORMANCE EXTREME COMPUTING CONFERENCE (HPEC) | 2019

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

The growing need for on-board image processing for space vehicles requires computing solutions that are both low-power and high-performance. Parallel computation using low-power embedded Graphics Processing Units (GPUs) satisfies both requirements. Our experiment uses OpenMPI domain decomposition of an image processing algorithm based upon a pre-trained convolutional neural network (CNN) developed by the U.S. Air Force Research Laboratory (AFRL). Our testbed consists of six NVIDIA Jetson TX2 development boards operating in parallel. This parallel framework results in a speedup of 4.3x on six processing nodes. This approach also leads to a linear decay in parallel efficiency as more processing nodes are added to the network. By replicating the data across processors in addition to distributing it, we also characterize the best-case impact of adding triple modular redundancy (TMR) to our application.
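
The abstract quotes a speedup but not the corresponding parallel efficiency. As a rough illustration, the sketch below applies the standard definition E = S / N to the two figures given in the abstract (4.3x speedup, six nodes). The function name `parallel_efficiency` is hypothetical, and this assumes the paper uses the same definition of efficiency, which the abstract does not confirm.

```python
def parallel_efficiency(speedup: float, nodes: int) -> float:
    """Return parallel efficiency E = S / N (standard textbook definition).

    Assumption: the paper's "parallel efficiency" follows this definition;
    the abstract does not state it explicitly.
    """
    if nodes <= 0:
        raise ValueError("nodes must be a positive integer")
    return speedup / nodes


# Figures quoted in the abstract: 4.3x speedup on six processing nodes.
efficiency = parallel_efficiency(4.3, 6)
print(f"Parallel efficiency: {efficiency:.1%}")  # prints "Parallel efficiency: 71.7%"
```

Under that assumption, six nodes deliver roughly 72% of ideal linear scaling, which is consistent with the abstract's remark that efficiency decays as nodes are added.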


Pages: 7
