A Depthwise Separable Convolution Neural Network for Small-footprint Keyword Spotting Using Approximate MAC Unit and Streaming Convolution Reuse

Citations: 0
Authors
Lu, Yicheng [1]
Shan, Weiwei [1]
Xu, Jiaming [1]
Affiliations
[1] Southeast Univ, Sch Elect Sci & Engn, Nanjing 210096, Peoples R China
Keywords
Keyword spotting; Approximate computing; Data reuse; Depthwise separable convolution
DOI
10.1109/apccas47518.2019.8953096
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract
In recent years, many voice wake-up applications have entered everyday life, and their key technology is keyword spotting (KWS). A keyword spotting system must continuously monitor ambient audio and be ready to wake up at any time, which requires low power consumption and high recognition accuracy. This paper mainly aims to reduce the power consumption of real-time keyword spotting systems. A deep neural network model with depthwise separable convolution (DS-Conv) is constructed and trained on Google's Speech Commands Dataset (GSCD). We propose an approximate multiply-and-accumulate unit (AP-MAC) and a data reuse method called Streaming Convolution Reuse (SCR). We show that the neural network with AP-MACs saves 37.7% to 42.6% of computing power and achieves a similar word error rate (WER) compared with the same model using traditional MAC units on the KWS task. In addition, SCR allows the model to reuse convolution results across multiple audio frames and saves 94% of activation storage. Combining the two methods reduces the computing power and memory storage per audio frame of the baseline model by 98.5% to 98.7% and 94%, respectively.
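As a rough illustration of why depthwise separable convolution suits a small-footprint model, the sketch below counts multiply-accumulate (MAC) operations for a standard convolution and for its depthwise separable counterpart. The function names and the layer shape are hypothetical, chosen only for illustration; they are not taken from the paper, and the sketch does not model the AP-MAC unit or SCR.

```python
def standard_conv_macs(h_out, w_out, c_in, c_out, k_h, k_w):
    """MACs for a standard 2-D convolution with an h_out x w_out x c_out output."""
    return h_out * w_out * c_out * c_in * k_h * k_w


def ds_conv_macs(h_out, w_out, c_in, c_out, k_h, k_w):
    """MACs for a depthwise convolution followed by a 1x1 pointwise convolution."""
    depthwise = h_out * w_out * c_in * k_h * k_w  # one k_h x k_w filter per input channel
    pointwise = h_out * w_out * c_in * c_out      # 1x1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer shape, assumed for illustration only (not from the paper).
shape = dict(h_out=25, w_out=5, c_in=64, c_out=64, k_h=3, k_w=3)

std = standard_conv_macs(**shape)
ds = ds_conv_macs(**shape)
ratio = ds / std  # algebraically equal to 1/c_out + 1/(k_h * k_w)

print(f"standard: {std} MACs, depthwise separable: {ds} MACs, ratio: {ratio:.3f}")
```

For any layer counted this way, the ratio reduces to 1/c_out + 1/(k_h * k_w), so the saving grows with the number of output channels and the kernel size. The savings reported in the abstract come from AP-MAC and SCR applied on top of such a model and cannot be derived from this count.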
Pages: 309-312
Page count: 4
Related papers
50 in total
  • [1] Small-Footprint Keyword Spotting with Multi-Scale Temporal Convolution
    Li, Ximin
    Wei, Xiaodong
    Qin, Xiaowei
    [J]. INTERSPEECH 2020, 2020, : 1987 - 1991
  • [2] Depthwise Separable Convolutional ResNet with Squeeze-and-Excitation Blocks for Small-footprint Keyword Spotting
    Xu, Menglong
    Zhang, Xiao-Lei
    [J]. INTERSPEECH 2020, 2020, : 2547 - 2551
  • [3] SMALL-FOOTPRINT KEYWORD SPOTTING USING DEEP NEURAL NETWORKS
    Chen, Guoguo
    Parada, Carolina
    Heigold, Georg
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [4] Convolutional Neural Networks for Small-footprint Keyword Spotting
    Sainath, Tara N.
    Parada, Carolina
    [J]. 16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 1478 - 1482
  • [5] Compressed time delay neural network for small-footprint keyword spotting
    Sun, Ming
    Snyder, David
    Gao, Yixin
    Nagaraja, Varun
    Rodehorst, Mike
    Panchapagesan, Sankaran
    Strom, Nikko
    Matsoukas, Spyros
    Vitaladevuni, Shiv
    [J]. 18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 3607 - 3611
  • [6] STREAMING SMALL-FOOTPRINT KEYWORD SPOTTING USING SEQUENCE-TO-SEQUENCE MODELS
    He, Yanzhang
    Prabhavalkar, Rohit
    Rao, Kanishka
    Li, Wei
    Bakhtin, Anton
    McGraw, Ian
    [J]. 2017 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2017, : 474 - 481
  • [7] SMALL-FOOTPRINT KEYWORD SPOTTING WITH GRAPH CONVOLUTIONAL NETWORK
    Chen, Xi
    Yin, Shouyi
    Song, Dandan
    Ouyang, Peng
    Liu, Leibo
    Wei, Shaojun
    [J]. 2019 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU 2019), 2019, : 539 - 546
  • [8] Convolutional Recurrent Neural Networks for Small-Footprint Keyword Spotting
    Arik, Sercan O.
    Kliegl, Markus
    Child, Rewon
    Hestness, Joel
    Gibiansky, Andrew
    Fougner, Chris
    Prenger, Ryan
    Coates, Adam
    [J]. 18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 1606 - 1610
  • [9] Region Proposal Network Based Small-Footprint Keyword Spotting
    Hou, Jingyong
    Shi, Yangyang
    Ostendorf, Mari
    Hwang, Mei-Yuh
    Xie, Lei
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2019, 26 (10) : 1471 - 1475
  • [10] A depthwise separable convolutional neural network for keyword spotting on an embedded system
    Sørensen, Peter Mølgaard
    Epp, Bastian
    May, Tobias
    [J]. EURASIP Journal on Audio, Speech, and Music Processing, 2020