PIE: A Pipeline Energy-efficient Accelerator for Inference Process in Deep Neural Networks

Citations: 0
Authors
Zhao, Yangyang [1 ]
Yu, Qi [1 ]
Zhou, Xuda [1 ]
Zhou, Xuehai [1 ]
Wang, Chao [1 ]
Li, Xi [1 ]
Affiliations
[1] USTC, Dept Comp Sci & Technol, Hefei, Peoples R China
Funding
U.S. National Science Foundation;
Keywords
accelerator; deep neural networks; FPGA; pipeline; inference;
DOI
10.1109/ICPADS.2016.139
Chinese Library Classification (CLC)
TP3 [computing technology; computer technology];
Subject classification code
0812 ;
Abstract
Speeding up the inference of deep neural networks (DNNs) with hardware accelerators based on field-programmable gate arrays (FPGAs) has become a popular research topic. Because of the layer-wise structure of DNNs and the data dependency between layers, previous studies commonly exploit the parallelism within a single layer to reduce computation time but neglect the parallelism between layers. In this paper, we propose PIE, a pipelined energy-efficient accelerator that speeds up DNN inference by pipelining the computation of two adjacent layers. By realizing two adjacent layers in different calculation orders, the data dependency between them is weakened: as soon as one layer produces an output, the next layer reads it as an input and immediately begins computing with a different calculation method. In this way, the computations of adjacent layers are pipelined. We evaluate PIE on a Zedboard development kit with a Xilinx Zynq-7000 FPGA and compare it against an Intel Core i7 4.0 GHz CPU and an NVIDIA K40C GPU. Experimental results show that PIE is 4.82x faster than the CPU and reduces the energy consumption of the CPU and GPU by 355.35x and 12.02x, respectively. Moreover, compared with a non-pipelined design in which layers are processed serially, PIE improves performance by nearly 50%.
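The inter-layer pipelining idea in the abstract can be illustrated with a minimal sketch (not the paper's FPGA implementation; the layer computations and names here are placeholders): the second layer consumes each output of the first layer as soon as it is produced, rather than waiting for the entire first layer to finish.

```python
# Illustrative sketch of pipelining two adjacent layers, assuming the
# first layer can emit its outputs incrementally (modeled here with a
# Python generator) and the second layer can start consuming them
# immediately in a streaming fashion.

def layer1_outputs(inputs):
    # Produce one output element at a time, modeling a layer that
    # emits results as they become ready instead of all at once.
    for x in inputs:
        yield 2 * x + 1  # placeholder per-element computation

def pipelined_layer2(stream):
    # Consume layer-1 outputs as they arrive, accumulating a partial
    # result without buffering the whole preceding layer's output.
    acc = 0
    for y in stream:
        acc += y  # placeholder accumulation (e.g. one dot-product term)
    return acc

inputs = [1, 2, 3, 4]
result = pipelined_layer2(layer1_outputs(inputs))
print(result)  # -> 24
```

In the serial (non-pipelined) scheme, layer 2 would only start after every element of layer 1's output had been computed and stored; streaming the elements through lets the two layers' work overlap, which is the source of the roughly 50% speedup the abstract reports.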
Pages: 1067-1074
Page count: 8