MAX-POOLING LOSS TRAINING OF LONG SHORT-TERM MEMORY NETWORKS FOR SMALL-FOOTPRINT KEYWORD SPOTTING

Authors
Sun, Ming [1]
Raju, Anirudh [2]
Tucker, George [3]
Panchapagesan, Sankaran [2]
Fu, Gengshen [1]
Mandal, Arindam [2]
Matsoukas, Spyros [1]
Strom, Nikko [4]
Vitaladevuni, Shiv [1]
Affiliations
[1] Amazon Com, Cambridge, MA USA
[2] Amazon Com, Sunnyvale, CA USA
[3] Google Brain, Mountain View, CA USA
[4] Amazon Com, Seattle, WA USA
Keywords
LSTM; keyword spotting; max-pooling loss; small-footprint
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We propose a max-pooling based loss function for training Long Short-Term Memory (LSTM) networks for small-footprint keyword spotting (KWS), where CPU, memory, and latency budgets are low. Max-pooling loss training can be further guided by initializing from a network trained with cross-entropy loss. Keyword spotting performance is measured with a posterior-smoothing based evaluation approach. Our experimental results show that LSTM models trained with either cross-entropy loss or max-pooling loss outperform a baseline feed-forward Deep Neural Network (DNN) trained with cross-entropy loss. In addition, a randomly initialized LSTM trained with max-pooling loss outperforms an LSTM trained with cross-entropy loss. Finally, the max-pooling loss trained LSTM initialized from a cross-entropy pre-trained network performs best, yielding a 67.6% relative reduction in the Area Under the Curve (AUC) measure compared with the baseline feed-forward DNN.
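The abstract names the max-pooling loss and posterior smoothing but does not define either, so the following is only a minimal NumPy sketch under stated assumptions. It assumes that, within each labeled keyword segment, only the frame with the highest posterior for the target keyword contributes to the loss, while every background frame contributes an ordinary frame-level cross-entropy term. It also assumes that posterior smoothing is a causal moving average over frames. The function names, the segment format `(start, end, class_id)`, and the `window` parameter are hypothetical and do not come from the record.

```python
import numpy as np


def log_softmax(logits):
    """Numerically stable log-softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))


def max_pooling_loss(logits, keyword_segments, background_class=0):
    """Sketch of a max-pooling loss (assumed formulation, see lead-in).

    logits: array of shape (T, C), one row of class scores per frame.
    keyword_segments: list of (start, end, class_id), end exclusive.
    Returns the summed loss and the frame index selected in each segment.
    """
    log_post = log_softmax(logits)
    num_frames = logits.shape[0]
    is_background = np.ones(num_frames, dtype=bool)
    loss = 0.0
    selected = []
    for start, end, class_id in keyword_segments:
        is_background[start:end] = False
        # Max-pooling over time: keep only the most confident frame.
        best = start + int(np.argmax(log_post[start:end, class_id]))
        loss -= log_post[best, class_id]
        selected.append(best)
    # Background frames: standard cross-entropy on every frame.
    loss -= log_post[is_background, background_class].sum()
    return loss, selected


def smooth_posteriors(posteriors, window):
    """Causal moving average over the last `window` frames (assumed form)."""
    num_frames = posteriors.shape[0]
    smoothed = np.empty_like(posteriors)
    for t in range(num_frames):
        smoothed[t] = posteriors[max(0, t - window + 1): t + 1].mean(axis=0)
    return smoothed


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frame_logits = rng.normal(size=(10, 2))
    total_loss, frames = max_pooling_loss(frame_logits, [(3, 7, 1)])
    print("loss:", total_loss, "selected frames:", frames)
```

In this sketch, gradients for a keyword segment would flow through a single frame per segment, which is the sense in which the loss pools over time. Initializing from a cross-entropy trained network, as the abstract describes, would simply mean starting this training from those weights.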
Pages: 474-480
Page count: 7