A Depthwise Separable Convolution Neural Network for Small-footprint Keyword Spotting Using Approximate MAC Unit and Streaming Convolution Reuse

Cited by: 0

Authors:

Lu, Yicheng [1]

Shan, Weiwei [1]

Xu, Jiaming [1]

Affiliations:

[1] Southeast Univ, Sch Elect Sci & Engn, Nanjing 210096, Peoples R China

Source:

2019 IEEE ASIA PACIFIC CONFERENCE ON CIRCUITS AND SYSTEMS (APCCAS 2019) | 2019

Keywords:

Keyword spotting; Approximate computing; Data reuse; Depthwise separable convolution

DOI:

10.1109/apccas47518.2019.8953096

Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronics and communication technology]

Discipline classification code:

0808; 0809

Abstract:

In recent years, many applications of voice wake-up technology have entered people's lives, and the key technology behind them is Keyword Spotting (KWS). A keyword spotting system must monitor ambient audio continuously and be ready to wake up at any time, which requires both low power consumption and high recognition accuracy. This paper mainly aims to reduce the power consumption of real-time keyword spotting systems. Based on Google's speech commands dataset (GSCD), a deep neural network model with Depthwise Separable Convolution (DS-Conv) is constructed and trained. We propose an Approximate Multiply and Accumulate Unit (AP-MAC) and a data reuse method called Streaming Convolution Reuse (SCR), and show that the neural network with AP-MACs saves 37.7% to 42.6% of computing power while achieving a Word Error Rate (WER) similar to that of the same model using traditional MAC units on the KWS task. In addition, SCR allows the model to reuse convolution results across multiple audio frames and saves 94% of activation storage. Combining the two methods reduces the computing power and memory storage per audio frame of the baseline model by 98.5% to 98.7% and 94%, respectively.
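The abstract does not describe how SCR is implemented, so the following is only a generic sketch of the underlying idea of reusing convolution results across overlapping audio frames. It assumes a temporal convolution whose output at each position depends on the last k input frames, and an analysis window that advances one frame at a time. The names `conv_at`, `recompute_window`, and `StreamingConv` are hypothetical and are not taken from the paper.

```python
from collections import deque

def conv_at(window, kernel):
    """One temporal-convolution output: weighted sum over k frames x f features."""
    return sum(w * x
               for frame, taps in zip(window, kernel)
               for x, w in zip(frame, taps))

def recompute_window(frames, kernel, n_out):
    """Baseline: recompute the last n_out convolution outputs from scratch."""
    k = len(kernel)
    first = len(frames) - k - n_out + 1
    return [conv_at(frames[s:s + k], kernel) for s in range(first, first + n_out)]

class StreamingConv:
    """Illustrative reuse scheme (an assumption, not the paper's SCR design)."""

    def __init__(self, kernel, n_out):
        self.kernel = kernel
        self.inputs = deque(maxlen=len(kernel))  # only the last k input frames
        self.outputs = deque(maxlen=n_out)       # cached convolution results

    def push(self, frame):
        """Consume one new frame and compute at most one new output."""
        self.inputs.append(frame)
        if len(self.inputs) == len(self.kernel):
            self.outputs.append(conv_at(list(self.inputs), self.kernel))
        return list(self.outputs)
```

In this sketch each new frame triggers a single convolution evaluation, while the remaining outputs in the window come from the cache, and only k input frames need to be buffered. How the savings reported in the abstract arise in the actual hardware design is not something this toy example can reproduce.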

Pages: 309-312

Number of pages: 4

50 records in total

[1] Small-Footprint Keyword Spotting with Multi-Scale Temporal Convolution
Li, Ximin
Wei, Xiaodong
Qin, Xiaowei
[J]. INTERSPEECH 2020, 2020, : 1987 - 1991
[2] Depthwise Separable Convolutional ResNet with Squeeze-and-Excitation Blocks for Small-footprint Keyword Spotting
Xu, Menglong
Zhang, Xiao-Lei
[J]. INTERSPEECH 2020, 2020, : 2547 - 2551
[3] SMALL-FOOTPRINT KEYWORD SPOTTING USING DEEP NEURAL NETWORKS
Chen, Guoguo
Parada, Carolina
Heigold, Georg
[J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
[4] Convolutional Neural Networks for Small-footprint Keyword Spotting
Sainath, Tara N.
Parada, Carolina
[J]. 16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 1478 - 1482
[5] Compressed time delay neural network for small-footprint keyword spotting
Sun, Ming
Snyder, David
Gao, Yixin
Nagaraja, Varun
Rodehorst, Mike
Panchapagesan, Sankaran
Strom, Nikko
Matsoukas, Spyros
Vitaladevuni, Shiv
[J]. 18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 3607 - 3611
[6] STREAMING SMALL-FOOTPRINT KEYWORD SPOTTING USING SEQUENCE-TO-SEQUENCE MODELS
He, Yanzhang
Prabhavalkar, Rohit
Rao, Kanishka
Li, Wei
Bakhtin, Anton
McGraw, Ian
[J]. 2017 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2017, : 474 - 481
[7] SMALL-FOOTPRINT KEYWORD SPOTTING WITH GRAPH CONVOLUTIONAL NETWORK
Chen, Xi
Yin, Shouyi
Song, Dandan
Ouyang, Peng
Liu, Leibo
Wei, Shaojun
[J]. 2019 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU 2019), 2019, : 539 - 546
[8] Convolutional Recurrent Neural Networks for Small-Footprint Keyword Spotting
Arik, Sercan O.
Kliegl, Markus
Child, Rewon
Hestness, Joel
Gibiansky, Andrew
Fougner, Chris
Prenger, Ryan
Coates, Adam
[J]. 18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 1606 - 1610
[9] Region Proposal Network Based Small-Footprint Keyword Spotting
Hou, Jingyong
Shi, Yangyang
Ostendorf, Mari
Hwang, Mei-Yuh
Xie, Lei
[J]. IEEE SIGNAL PROCESSING LETTERS, 2019, 26 (10) : 1471 - 1475
[10] A depthwise separable convolutional neural network for keyword spotting on an embedded system
Peter Mølgaard Sørensen
Bastian Epp
Tobias May
[J]. EURASIP Journal on Audio, Speech, and Music Processing, 2020
