Exploration of On-device End-to-End Acoustic Modeling with Neural Networks

Cited by: 0
Authors
Sung, Wonyong [1]
Lee, Lukas [1]
Park, Jinhwan [1]
Affiliations
[1] Seoul Natl Univ, Dept Elect & Comp Engn, Seoul, South Korea
Funding
National Research Foundation of Singapore
Keywords
speech recognition; embedded systems; neural networks; multi-time-step parallelization
DOI
10.1109/sips47522.2019.9020317
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Real-time speech recognition on mobile and embedded devices is an important application of neural networks. Acoustic modeling is a fundamental part of speech recognition and is usually implemented with long short-term memory (LSTM)-based recurrent neural networks (RNNs). However, single-thread execution of an LSTM RNN is extremely slow on most embedded devices because the algorithm must fetch a large number of parameters from DRAM to compute each output sample. We explore several acoustic modeling algorithms that can be executed very efficiently on embedded devices. These algorithms reduce memory-access overhead using multi-time-step parallelization, which computes multiple output samples at a time while reading the parameters from DRAM only once. The algorithms considered are quasi-RNNs (QRNNs), Gated ConvNets, and diagonalized LSTMs. In addition, we explore neural networks that add a one-dimensional (1-D) convolution at each layer of these algorithms, which yields a very large performance increase for QRNNs and Gated ConvNets. The experiments were conducted using connectionist temporal classification (CTC)-based end-to-end speech recognition on the WSJ corpus. We not only significantly increase the execution speed but also obtain much higher accuracy than LSTM RNN-based modeling. Thus, this work is applicable not only to embedded-system implementations but also to server-based ones.
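The multi-time-step idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a QRNN-style layer with filter width 1 and f-pooling, uses plain Python lists in place of a tuned kernel, and treats "visiting a weight row" as a stand-in for one parameter fetch from DRAM. The names `gates_blocked`, `gates_stepwise`, and `qrnn_f_pooling` are hypothetical.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def gates_blocked(W, xs):
    """Compute W @ x_t for a block of time steps, visiting each weight row once.

    W: list of rows (hidden x input); xs: list of T input vectors.
    Returns (pre-activations indexed [t][h], number of weight-row fetches).
    """
    out = [[0.0] * len(W) for _ in xs]
    fetches = 0
    for h, row in enumerate(W):      # each row is "fetched" once ...
        fetches += 1
        for t, x in enumerate(xs):   # ... and reused for every step in the block
            out[t][h] = sum(w * v for w, v in zip(row, x))
    return out, fetches


def gates_stepwise(W, xs):
    """Baseline: one time step at a time, so every row is re-fetched per step."""
    out, fetches = [], 0
    for x in xs:
        step = []
        for row in W:
            fetches += 1
            step.append(sum(w * v for w, v in zip(row, x)))
        out.append(step)
    return out, fetches


def qrnn_f_pooling(z_pre, f_pre, c0):
    """Element-wise recurrence c_t = f_t * c_{t-1} + (1 - f_t) * z_t.

    This is the only sequential part; it needs no weight matrix at all.
    """
    c, cs = list(c0), []
    for z_t, f_t in zip(z_pre, f_pre):
        c = [sigmoid(f) * ci + (1.0 - sigmoid(f)) * math.tanh(z)
             for z, f, ci in zip(z_t, f_t, c)]
        cs.append(c)
    return cs


if __name__ == "__main__":
    # Toy sizes chosen for illustration only (assumed, not from the paper).
    W_z = [[0.5, -0.2], [0.1, 0.3]]
    W_f = [[0.2, 0.4], [-0.3, 0.1]]
    xs = [[1.0, 2.0], [0.0, 1.0], [3.0, -1.0], [0.5, 0.5]]
    z_pre, n_blocked = gates_blocked(W_z, xs)
    f_pre, _ = gates_blocked(W_f, xs)
    _, n_stepwise = gates_stepwise(W_z, xs)
    print("row fetches (blocked vs. stepwise):", n_blocked, n_stepwise)
    print("states:", qrnn_f_pooling(z_pre, f_pre, [0.0, 0.0]))
```

Both gate functions return the same values; they differ only in loop order, so the blocked version touches each weight row once per block of T steps rather than once per step. An LSTM cannot be reordered this way, since its gate inputs depend on the previous hidden state through a full weight matrix.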
Pages: 160-165
Number of pages: 6