Deep learning parallel computing and evaluation for embedded system clustering architecture processor

Cited by: 4
Author
Zu, Yue [1]
Affiliation
[1] Jilin Inst Chem Technol, Dept Human Resources Off, Jilin 132022, Jilin, Peoples R China
Keywords
Clustered architecture processor; Parallel computing; Deep learning; Performance evaluation;
DOI
10.1007/s10617-020-09235-5
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In the era of intelligent systems, large-scale information processing and a wide range of intelligent applications rely on embedded devices, and machine learning algorithms therefore play an increasingly important role. High-performance embedded computing is an effective way to address the limited computing power of embedded devices. New intelligent embedded applications based on machine learning carry a heavier computational load than traditional embedded systems can handle, so this paper studies parallel optimization and implementation techniques for convolutional neural networks on the Parallella platform. A parallel optimization strategy is given for convolutional neural networks on the clustered-architecture processor of a heterogeneous multi-core system. A high-performance implementation of a convolutional neural network on the Parallella platform is then studied, and the functionality of the convolutional neural network system is implemented. A set of performance evaluation methods for embedded parallel processors is also proposed. For the S698P application case, the eCos operating system is selected as the platform; single-core and multi-core modes are compared on the GRSIM simulator, and a parallel performance evaluation is given. Experiments show that the efficiency of deep learning tasks improves significantly compared with traditional parallel methods.
Pages: 145-159
Page count: 15
Source
Design Automation for Embedded Systems, 2020, 24: 145-159