Sub-band Convolutional Neural Networks for Small-footprint Spoken Term Classification

Cited by: 6
Authors
Kao, Chieh-Chi [1]
Sun, Ming [1]
Gao, Yixin [1]
Vitaladevuni, Shiv [1]
Wang, Chao [1]
Affiliations
[1] Amazon, Alexa Speech, Seattle, WA 98109 USA
Source
Keywords
spoken term classification; convolutional neural network (CNN); sub-band feature; COMPRESSION;
DOI
10.21437/Interspeech.2019-1766
Chinese Library Classification number
R36 [Pathology]; R76 [Otorhinolaryngology];
Discipline classification code
100104; 100213;
Abstract
This paper proposes a sub-band convolutional neural network (CNN) for spoken term classification. CNNs have proven very effective in acoustic applications such as spoken term classification, keyword spotting, speaker identification, and acoustic event detection. Unlike in computer vision, the spatial invariance of 2D convolutional kernels does not fit acoustic applications well, because the meaning of a given 2D kernel varies considerably along the feature axis of an input feature map. We propose a sub-band CNN architecture that applies different convolutional kernels to each feature sub-band, which makes the overall computation more efficient. Experimental results show that the computational efficiency gained from the sub-band CNN is more beneficial for small-footprint models. Compared with a baseline full-band CNN for spoken term classification on the publicly available Speech Commands dataset, the proposed sub-band CNN architecture reduces computation by 39.7% on commands classification and by 49.3% on digits classification while maintaining accuracy.
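The core idea in the abstract, applying a different kernel to each feature sub-band instead of sharing one kernel across the whole feature axis, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the two-band split, the 2x2 kernels, and the toy input are all hypothetical, and the sketch does not attempt to reproduce the reported computation savings.

```python
def conv2d_valid(x, k):
    """Valid-mode 2D cross-correlation of matrix x (list of rows) with kernel k."""
    kh, kw = len(k), len(k[0])
    out_h, out_w = len(x) - kh + 1, len(x[0]) - kw + 1
    return [
        [
            sum(x[i + a][j + b] * k[a][b] for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def split_sub_bands(features, num_bands):
    """Split a (feature x time) map into contiguous, equal-height sub-bands."""
    if len(features) % num_bands != 0:
        raise ValueError("feature dimension must divide evenly into sub-bands")
    band_h = len(features) // num_bands
    return [features[b * band_h:(b + 1) * band_h] for b in range(num_bands)]


def sub_band_conv(features, kernels):
    """Apply kernels[b] only to sub-band b; stack outputs along the feature axis."""
    bands = split_sub_bands(features, len(kernels))
    out = []
    for band, kernel in zip(bands, kernels):
        out.extend(conv2d_valid(band, kernel))
    return out


# Hypothetical toy input: 4 feature bins (rows) x 5 time frames (columns).
features = [
    [1, 2, 3, 4, 5],
    [0, 1, 0, 1, 0],
    [2, 2, 2, 2, 2],
    [5, 4, 3, 2, 1],
]
low_kernel = [[1, 0], [0, 1]]     # used only on the lower two feature bins
high_kernel = [[1, -1], [1, -1]]  # used only on the upper two feature bins

result = sub_band_conv(features, [low_kernel, high_kernel])
print(result)
```

Each kernel in this sketch only ever sees its own band, so it is free to specialize to that region of the feature axis, which is the property the abstract contrasts with a shared full-band kernel.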
Pages: 2195-2199
Number of pages: 5