Highly-Efficient Parallel Convolution Acceleration by Using Multiple GPUs

Cited by: 0
Authors
Sun, Kuangyuan [1]
Li, Shuai [1]
Luo, Yukui [1]
Renteria, Raul [1]
Choi, Ken [1]
Affiliations
[1] IIT, Dept Elect & Comp Engn, VLSI Design & Automat Lab, 3301 S Dearborn St, Chicago, IL 60616 USA
Keywords
Convolutional neural network; parallel acceleration; multiple GPUs
DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Convolutional Neural Networks (CNNs) are a powerful tool in machine learning. However, convolution is computationally time-consuming, which limits its application on embedded systems. In this paper, we introduce a parallel convolution acceleration implementation that uses the multiple GPUs (Mali-T628 MP6) on the Odroid XU4 embedded system, and we test its execution-time reduction and GPU utilization. The results show that the execution time is reduced by 25.8% on average.
Pages: 300 - 301
Page count: 2
Related papers
50 in total
  • [31] Enabling Efficient Fast Convolution Algorithms on GPUs via MegaKernels
    Jia, Liancheng
    Liang, Yun
    Li, Xiuhong
    Lu, Liqiang
    Yan, Shengen
    IEEE TRANSACTIONS ON COMPUTERS, 2020, 69 (07) : 986 - 997
  • [32] Petascale turbulence simulation using a highly parallel fast multipole method on GPUs
    Yokota, Rio
    Barba, L. A.
    Narumi, Tetsu
    Yasuoka, Kenji
    COMPUTER PHYSICS COMMUNICATIONS, 2013, 184 (03) : 445 - 455
  • [33] LIGHT-INTENSITY STABILIZATION USING HIGHLY-EFFICIENT FARADAY ROTATOR
    YOSHINO, T
    UMEGAKI, S
    INOUE, H
    KUROSAWA, K
    JAPANESE JOURNAL OF APPLIED PHYSICS PART 1-REGULAR PAPERS SHORT NOTES & REVIEW PAPERS, 1982, 21 (04) : 612 - 616
  • [34] Parallel Acceleration on Simulation of a 2D Takeuchi Electrophysiology Cardiac Model Using GPUs
    Qiu, Feng
    Liu, Baohua
    Shen, Wenfeng
    Shen, Yanghua
    Hu, Yaopeng
    Zhu, Xin
    PROCEEDINGS OF 2017 3RD IEEE INTERNATIONAL CONFERENCE ON COMPUTER AND COMMUNICATIONS (ICCC), 2017, : 2665 - 2668
  • [35] Memory-Efficient Parallel Simulation of Electron Beam Dynamics Using GPUs
    Arumugam, Kamesh
    Godunov, Alexander
    Ranjan, Desh
    Terzic, Balsa
    Zubair, Mohammad
    PROCEEDINGS OF 2016 IEEE 23RD INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING (HIPC), 2016, : 212 - 221
  • [36] A Winograd-Based Highly-Parallel Convolution Engine for 8-bit CNN Acceleration
    Chen, Yong-Tai
    Ou, Yu-Feng
    Huang, Chao-Tsung
    2022 IEEE INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE CIRCUITS AND SYSTEMS (AICAS 2022): INTELLIGENT TECHNOLOGY IN THE POST-PANDEMIC ERA, 2022, : 395 - 398
  • [37] A highly-efficient transparent online memory test
    Thaller, K
    INTERNATIONAL TEST CONFERENCE 2001, PROCEEDINGS, 2001, : 230 - 239
  • [38] A Highly-Efficient, Unidirectional Miniaturized Slot Antenna
    Al-Joumayly, Mudar A.
    Behdad, Nader
    2008 IEEE ANTENNAS AND PROPAGATION SOCIETY INTERNATIONAL SYMPOSIUM, VOLS 1-9, 2008, : 1128 - 1131
  • [39] A Highly Parallel Reuse Distance Analysis Algorithm on GPUs
    Cui, Huimin
    Yi, Qing
    Xue, Jingling
    Wang, Lei
    Yang, Yang
    Feng, Xiaobing
    2012 IEEE 26TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM (IPDPS), 2012, : 1080 - 1092
  • [40] Highly Parallel Transformation and Quantization for HEVC Encoder on GPUs
    Igarashi, Hiroaki
    Takano, Fumiyo
    Moriyoshi, Tatsuji
    2016 30TH ANNIVERSARY OF VISUAL COMMUNICATION AND IMAGE PROCESSING (VCIP), 2016,