BIFOCAL NEURAL ASR: EXPLOITING KEYWORD SPOTTING FOR INFERENCE OPTIMIZATION

Cited by: 7
Authors
Macoskey, Jon [1]
Strimel, Grant P. [1]
Rastrow, Ariya [1]
Affiliations
[1] Amazon.com, Seattle, WA 98109 USA
Keywords
On-device speech recognition; recurrent neural network transducer (RNN-T); inference optimization
DOI
10.1109/ICASSP39728.2021.9414652
Chinese Library Classification (CLC) number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
We present Bifocal RNN-T, a new variant of the Recurrent Neural Network Transducer (RNN-T) architecture designed to reduce inference-time latency on speech recognition tasks. The architecture dynamically pivots its runtime compute pathway: it uses keyword spotting to select which component of the network to execute for a given audio frame. To accomplish this, we use a recurrent cell we call the Bifocal LSTM (BF-LSTM), which we detail in the paper. The architecture is compatible with other optimization strategies such as quantization, sparsification, and time-reduction layers, making it especially applicable to deployed, real-time speech recognition settings. We present the architecture and report comparative experimental results on voice-assistant speech recognition tasks. Specifically, we show that our proposed Bifocal RNN-T can reduce inference cost by 29.1% with matching word error rates and only a minor increase in memory size.
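The per-frame pathway selection described in the abstract can be illustrated with a toy sketch. This is not the BF-LSTM itself, which the abstract defers to the paper. It assumes, purely for illustration, that frames flagged by a keyword spotter go to a cheaper branch and all other frames go to a larger one, with both branches updating one shared recurrent state. The `Branch` class, the `run_frames` function, the scalar update rules, and the cost values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Branch:
    """One compute pathway. `cost` is a made-up relative cost per frame."""
    name: str
    cost: float
    step: Callable[[float, float], float]  # (frame, state) -> new state


def run_frames(
    frames: Sequence[float],
    in_keyword: Sequence[bool],
    small: Branch,
    large: Branch,
) -> Tuple[float, float, List[str]]:
    """Run exactly one branch per frame, chosen by the keyword flag.

    Both branches read and write the same recurrent state, so switching
    pathways mid-utterance does not reset the history (an assumption of
    this sketch, not a detail taken from the abstract).
    """
    if len(frames) != len(in_keyword):
        raise ValueError("frames and in_keyword must have the same length")
    state = 0.0
    total_cost = 0.0
    trace: List[str] = []
    for frame, is_kw in zip(frames, in_keyword):
        branch = small if is_kw else large
        state = branch.step(frame, state)
        total_cost += branch.cost
        trace.append(branch.name)
    return state, total_cost, trace


# Hypothetical branches: scalar leaky integrators stand in for real cells.
small = Branch("small", 1.0, lambda x, h: 0.5 * h + 0.5 * x)
large = Branch("large", 4.0, lambda x, h: 0.9 * h + 0.1 * x)

# Toy "audio": the first two frames are assumed to lie inside the keyword.
frames = [0.2, 0.4, 0.1, 0.3, 0.5]
in_keyword = [True, True, False, False, False]

state, cost, trace = run_frames(frames, in_keyword, small, large)
baseline = large.cost * len(frames)  # cost if every frame used the large branch
saving = 1.0 - cost / baseline
```

With these made-up costs, routing two of five frames to the cheap branch costs 14 units against a baseline of 20, a 30% reduction. The figures only illustrate the accounting and are unrelated to the 29.1% reported in the abstract.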
Pages: 5999-6003
Number of pages: 5