Deep learning parallel computing and evaluation for embedded system clustering architecture processor

Cited by: 4
Author
Zu, Yue [1]
Affiliation
[1] Jilin Inst Chem Technol, Dept Human Resources Off, Jilin 132022, Jilin, Peoples R China
Keywords
Clustered architecture processor; Parallel computing; Deep learning; Performance evaluation;
DOI
10.1007/s10617-020-09235-5
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In the era of intelligence, processing large amounts of information and running various intelligent applications increasingly rely on embedded devices, and this trend has given machine learning algorithms an increasingly important role. High-performance embedded computing is an effective way to address the limited computing power of embedded devices. New intelligent embedded applications based on machine learning demand more computation than traditional embedded systems can readily supply. To address this problem, this paper studies parallel optimization and implementation techniques for convolutional neural networks on the Parallella platform. A parallel optimization strategy is given for convolutional neural networks on the clustering architecture processor of a heterogeneous multi-core system. A high-performance implementation of a convolutional neural network on the Parallella platform is then studied, and the functions of a convolutional neural network system are implemented. A set of performance evaluation methods for embedded parallel processors is also proposed. From the perspective of S698P applications, the eCos operating system is selected as the platform; single-core and multi-core modes are compared on the GRSIM simulator, and a parallel performance evaluation is given. Experiments show that the efficiency of deep learning tasks is significantly improved compared with traditional parallel methods.
Pages: 145-159
Page count: 15