Deep learning parallel computing and evaluation for embedded system clustering architecture processor

Cited by: 4
Authors
Zu, Yue [1]
Affiliation
[1] Jilin Inst Chem Technol, Dept Human Resources Off, Jilin 132022, Jilin, Peoples R China
Keywords
Clustered architecture processor; Parallel computing; Deep learning; Performance evaluation;
DOI
10.1007/s10617-020-09235-5
Chinese Library Classification
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
In the era of intelligent systems, processing large volumes of information and running a wide range of intelligent applications increasingly relies on embedded devices, and this trend has given machine learning algorithms a growing role. High-performance embedded computing is an effective way to address the limited computing power of embedded devices. New intelligent embedded applications based on machine learning demand more computation than traditional embedded systems can readily provide. To address this problem, this paper studies parallel optimization and implementation techniques for convolutional neural networks on the Parallella platform. A parallel optimization strategy is given for convolutional neural networks on the clustered architecture processor of a heterogeneous multi-core system. The high-performance implementation of convolutional neural networks on the Parallella platform is then studied, and a functioning convolutional neural network system is implemented. A set of performance evaluation methods for embedded parallel processors is also proposed. From the application perspective of the S698P, the eCos operating system is selected as the platform; single-core and multi-core modes are compared on the GRSIM simulator, and a parallel performance evaluation is given. Experiments show that the efficiency of deep learning tasks is significantly improved compared with traditional parallel methods.
Pages: 145-159
Page count: 15
Related papers
50 in total
  • [31] Deep Embedded Clustering of Urban Communities Using Federated Learning
    Mashhadi, Afra
    Sterner, Joshua
    Murray, Jeffrey
    2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,
  • [32] A parallel CNC system architecture based on Symmetric Multi-processor
    Fu, Hongya
    Li, Cong
    Fu, Yunzhong
    PROCEEDINGS OF 2016 SIXTH INTERNATIONAL CONFERENCE ON INSTRUMENTATION & MEASUREMENT, COMPUTER, COMMUNICATION AND CONTROL (IMCCC 2016), 2016, : 634 - 637
  • [33] PolyPC: Polymorphic Parallel Computing Framework on Embedded Reconfigurable System
    Ding, Hongyuan
    Huang, Miaoqing
    2017 27TH INTERNATIONAL CONFERENCE ON FIELD PROGRAMMABLE LOGIC AND APPLICATIONS (FPL), 2017,
  • [34] Proposal of a multi-threaded processor architecture for embedded systems and its evaluation
    Kobayashi, S
    Takeuchi, Y
    Kitajima, A
    Imai, M
    IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2001, E84A (03): : 748 - 754
  • [35] Design of embedded sensor system with parallel reconfigurable computing platform
    Kao, Chi-Chou
    Kao, Chi-Chou (cckao@mail.nutn.edu.tw), 1600, Codon Publications (31): : 266 - 274
  • [36] PURE OPTICAL PARALLEL ARRAY LOGIC SYSTEM - AN OPTICAL PARALLEL COMPUTING ARCHITECTURE
    KONISHI, T
    TANIDA, J
    ICHIOKA, Y
    IEICE TRANSACTIONS ON ELECTRONICS, 1994, E77C (01) : 30 - 34
  • [37] A Deep Learning System for Detecting IoT Web Attacks With a Joint Embedded Prediction Architecture (JEPA)
    An, Yufei
    Yu, F. Richard
    He, Ying
    Li, Jianqiang
    Chen, Jianyong
    Leung, Victor C. M.
    IEEE TRANSACTIONS ON NETWORK AND SERVICE MANAGEMENT, 2024, 21 (06): : 6885 - 6898
  • [38] A Survey of Clustering With Deep Learning: From the Perspective of Network Architecture
    Min, Erxue
    Guo, Xifeng
    Liu, Qiang
    Zhang, Gen
    Cui, Jianjing
    Long, Jun
    IEEE ACCESS, 2018, 6 : 39501 - 39514
  • [39] Design methodology and system for a configurable media embedded processor extensible to VLIW architecture
    Mizuno, A
    Kohno, K
    Ohyama, R
    Tokuyoshi, T
    Uetani, H
    Eichel, H
    Miyamori, T
    Matsumoto, N
    Matsui, M
    ICCD'2002: IEEE INTERNATIONAL CONFERENCE ON COMPUTER DESIGN: VLSI IN COMPUTERS AND PROCESSORS, PROCEEDINGS, 2002, : 2 - 7
  • [40] An Energy-Efficient and Scalable Deep Learning/Inference Processor With Tetra-Parallel MIMD Architecture for Big Data Applications
    Park, Seong-Wook
    Park, Junyoung
    Bong, Kyeongryeol
    Shin, Dongjoo
    Lee, Jinmook
    Choi, Sungpill
    Yoo, Hoi-Jun
    IEEE TRANSACTIONS ON BIOMEDICAL CIRCUITS AND SYSTEMS, 2015, 9 (06) : 838 - 848