Deep learning parallel computing and evaluation for embedded system clustering architecture processor

Cited by: 4

Author:

Zu, Yue [1]

Affiliation:

[1] Jilin Inst Chem Technol, Dept Human Resources Off, Jilin 132022, Jilin, Peoples R China

Source:

DESIGN AUTOMATION FOR EMBEDDED SYSTEMS | 2020 / Vol. 24 / Issue 03

Keywords:

Clustered architecture processor; Parallel computing; Deep learning; Performance evaluation;

DOI:

10.1007/s10617-020-09235-5

Chinese Library Classification:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

In the era of intelligence, processing large volumes of information and running various intelligent applications increasingly depend on embedded devices, which gives machine learning algorithms a growing role. High-performance embedded computing is an effective way to address the limited computing power of embedded devices. New intelligent embedded applications based on machine learning demand more computation than traditional embedded systems can readily supply. To address this, the paper studies parallel optimization and implementation techniques for convolutional neural networks on the Parallella platform. It presents a parallel optimization strategy for convolutional neural networks on the clustered architecture processor of a heterogeneous multi-core system, then studies a high-performance implementation on the Parallella platform and implements a working convolutional neural network system. It also proposes a set of performance evaluation methods for embedded parallel processors. From the application standpoint of the S698P, the eCos operating system is selected as the platform, single-core and multi-core modes are compared on the GRSIM simulator, and a parallel performance evaluation is given. Experiments show that the efficiency of deep learning tasks is significantly improved compared with traditional parallel methods.

Pages: 145-159

Page count: 15
