Multi-layer LSTM Parallel Optimization Based on Hardware and Software Cooperation

Cited by: 0
Authors
Chen, Qingfeng [1]
Wu, Jing [1]
Huang, Feihu [1]
Han, Yu [1]
Zhao, Qiming [1]
Affiliations
[1] Wuhan Univ Sci & Technol, Sch Comp Sci & Technol, Wuhan 430065, Peoples R China
Keywords
LSTM; Software and hardware cooperation; Parallelism; RNN; NLP; COMPRESSION; PREDICTION; SYSTEMS;
DOI
10.1007/978-3-031-10986-7_55
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
LSTM's special gate structure and memory unit make it well suited to time-series problems, and it performs well in machine translation and reasoning. However, LSTM also has shortcomings, such as low parallelism, which limits computing speed. Existing optimization approaches tend to focus on either software or hardware alone. The former mostly targets model accuracy, and CPU-accelerated LSTM does not dynamically adjust to network characteristics. The latter can be tailored to the LSTM model structure, but customized accelerators are often limited by that structure and cannot fully exploit the advantages of the hardware. This paper proposes a multi-layer LSTM optimization scheme based on software and hardware collaboration. We use a row-wise pruning scheme to greatly reduce the number of parameters while maintaining accuracy, adapting the model to the parallel structure of the hardware. From the software perspective, we analyze the multi-layer LSTM module and conclude that some neurons in different layers can be computed in parallel. We therefore redesign the computational order of the multi-layer LSTM so that the model preserves its own timing dependencies while remaining hardware friendly. Experiments show that throughput increases by 10x compared with the CPU implementation. Compared with other hardware accelerators, throughput increases by 1.2x-1.4x, and latency and resource utilization also improve.
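The abstract's observation that neurons in different layers can be computed in parallel can be illustrated with a wavefront ordering. In a stacked LSTM, the cell at layer l and time step t needs only the output of layer l-1 at step t and the state of layer l at step t-1, so all cells with the same value of l + t are mutually independent. The sketch below is only an illustration of that dependency pattern, not the authors' scheduling scheme, which the abstract does not describe; the function name `wavefront_schedule` and its parameters are hypothetical.

```python
from collections import defaultdict


def wavefront_schedule(num_layers, num_steps):
    """Group stacked-LSTM cells (layer, t) into waves of independent cells.

    Assumption: cell (layer, t) depends only on (layer - 1, t) and
    (layer, t - 1), as in a standard unidirectional stacked LSTM.
    Cells sharing the same layer + t therefore have no mutual dependency.
    """
    waves = defaultdict(list)
    for layer in range(num_layers):
        for t in range(num_steps):
            waves[layer + t].append((layer, t))
    # Waves must run in order; cells inside one wave may run in parallel.
    return [waves[k] for k in range(num_layers + num_steps - 1)]


if __name__ == "__main__":
    for index, wave in enumerate(wavefront_schedule(num_layers=2, num_steps=3)):
        print(index, wave)
```

With this ordering, a model with L layers and T time steps needs L + T - 1 sequential waves instead of L * T sequential cell evaluations, which is the kind of cross-layer parallelism a hardware pipeline could exploit.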
Pages: 681-693
Page count: 13
Related papers
50 in total
  • [1] Multi-layer software reliability for unreliable hardware
    Shafique, Muhammad
    Axer, Philip
    Borchert, Christoph
    Chen, Jian-Jia
    Chen, Kuan-Hsun
    Doebel, Bjoern
    Ernst, Rolf
    Haertig, Hermann
    Heinig, Andreas
    Kapitza, Ruediger
    Kriebel, Florian
    Lohmann, Daniel
    Marwedel, Peter
    Rehman, Semeen
    Schmoll, Florian
    Spinczyk, Olaf
    IT-INFORMATION TECHNOLOGY, 2015, 57 (03): : 170 - 180
  • [2] IP Core Based Hardware Implementation of Multi-layer Perceptrons on FPGAs: A Parallel Approach
    Li, Xiaojun
    Li, Lin
    MATERIALS SCIENCE AND INFORMATION TECHNOLOGY, PTS 1-8, 2012, 433-440 : 5647 - 5653
  • [3] Payload-Based Traffic Classification Using Multi-Layer LSTM in Software Defined Networks
    Lim, Hyun-Kyo
    Kim, Ju-Bong
    Kim, Kwihoon
    Hong, Yong-Geun
    Han, Youn-Hee
    APPLIED SCIENCES-BASEL, 2019, 9 (12):
  • [4] A Multi-Layer Parallel Hardware Architecture for Homomorphic Computation in Machine Learning
    Xin, Guozhu
    Zhao, Yifan
    Han, Jun
    2021 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2021,
  • [5] A Multi-Layer Parallel LSTM Network for Human Activity Recognition with Smartphone Sensors
    Yu, Tao
    Chen, Jianxin
    Yan, Na
    Liu, Xipeng
    2018 10TH INTERNATIONAL CONFERENCE ON WIRELESS COMMUNICATIONS AND SIGNAL PROCESSING (WCSP), 2018,
  • [6] Multi-layer LSTM network statement generation based on mixed input
    Liu, Qingqing
    Xia, Zhengyou
    2019 SEVENTH INTERNATIONAL CONFERENCE ON ADVANCED CLOUD AND BIG DATA (CBD), 2019, : 223 - 228
  • [7] Multi-layer parallel shooting method for multi-layer boundary value problems
    Allan, Fathi M.
    Hajji, Mohamed Ali
    2009 INTERNATIONAL CONFERENCE ON INNOVATIONS IN INFORMATION TECHNOLOGY, 2009, : 315 - 319
  • [8] Multi-layer optimization algorithm
    Abdalla, M.
    Yamin, J.
    Al-Khawaldeh, E.
    JOURNAL OF ALGORITHMS & COMPUTATIONAL TECHNOLOGY, 2022, 16
  • [9] Multi-layer breakwater optimization
    Arsie, A
    Belorgey, M
    PROCEEDINGS OF THE EIGHTH INTERNATIONAL OFFSHORE AND POLAR ENGINEERING CONFERENCE, VOL 3, 1998, : 578 - 583
  • [10] Stochastic parallel gradient descent optimization based on decoupling of the software and hardware
    Fu, Qiang
    Pott, Joerg-Uwe
    Shen, Feng
    Rao, Changhui
    OPTICS COMMUNICATIONS, 2014, 310 : 138 - 149