Accelerating CNN Algorithm with Fine-grained Dataflow Architectures

Cited by: 3

Authors:

Xiang, Taoran [1,2]

Feng, Yujing [1]

Ye, Xiaochun [1]

Tan, Xu [1,2]

Li, Wenming [1]

Zhu, Yatao [1]

Wu, Meng [1]

Zhang, Hao [1]

Fan, Dongrui [1,2]

Affiliations:

[1] Chinese Acad Sci, ICT, State Key Lab Comp Architecture, Beijing, Peoples R China

[2] UCAS, Sch Comp & Control Engn, Beijing, Peoples R China

Source:

IEEE 20TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS / IEEE 16TH INTERNATIONAL CONFERENCE ON SMART CITY / IEEE 4TH INTERNATIONAL CONFERENCE ON DATA SCIENCE AND SYSTEMS (HPCC/SMARTCITY/DSS) | 2018

Funding:

National Natural Science Foundation of China

Keywords:

fine-grained dataflow; Convolutional Neural Network; general accelerator; data reuse; high parallel;

DOI:

10.1109/HPCC/SmartCity/DSS.2018.00063

Chinese Library Classification (CLC):

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Convolutional Neural Networks (CNNs) are state-of-the-art algorithms widely used in applications such as face recognition, intelligent monitoring, image recognition, and text recognition. Because of their high computational complexity, many hardware accelerators have been proposed to exploit a high degree of parallelism in CNNs. However, accelerators implemented on FPGAs and ASICs usually sacrifice generality for higher performance and lower power consumption. Other accelerators, such as GPUs, are general enough but consume more power. Fine-grained dataflow architectures, which break with the conventional von Neumann architecture, show natural advantages in processing CNN-like algorithms with high computational efficiency and low power consumption, while remaining broadly applicable and adaptable. In this paper, we propose a scheme for implementing and optimizing CNNs on accelerators based on a fine-grained dataflow architecture. The experimental results show that, with our scheme, AlexNet runs with 3.11x higher performance on the dataflow accelerator than on an NVIDIA Tesla K80, and our hardware consumes 8.52x less power than the K80.
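The two headline ratios in the abstract can be combined into a rough performance-per-watt figure. The sketch below is not from the paper: it assumes both ratios were measured on the same AlexNet workload, and the function name `perf_per_watt_gain` is hypothetical, chosen only for illustration.

```python
def perf_per_watt_gain(speedup: float, power_reduction: float) -> float:
    """Relative performance-per-watt gain over a baseline.

    Assumption (not stated in the abstract): the speedup and the power
    reduction were measured on the same workload, so the ratios multiply.
    """
    return speedup * power_reduction


# Ratios reported in the abstract, relative to the NVIDIA Tesla K80.
gain = perf_per_watt_gain(speedup=3.11, power_reduction=8.52)
print(f"Approximate performance-per-watt gain: {gain:.1f}x")
```

Under that assumption the product is about 26.5x. It is a derived estimate, not a result reported by the authors.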

Pages: 243-251

Page count: 9
