An empirical study of cross-lingual transfer learning techniques for small-footprint keyword spotting

被引:7
|
作者
Sun, Ming [1 ]
Schwarz, Andreas [1 ]
Wu, Minhua [1 ]
Strom, Nikko [1 ]
Matsoukas, Spyros [1 ]
Vitaladevuni, Shiv [1 ]
机构
[1] Amazon Com, Alexa Machine Learning, Seattle, WA USA
关键词
transfer learning; keyword spotting; cross lingual; small-footprint; CANONICAL CORRELATION-ANALYSIS; SPEECH;
D O I
10.1109/ICMLA.2017.0-150
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
This paper presents our work on building a small footprint keyword spotting system for a resource-limited language, which requires low CPU, memory and latency. Our keyword spotting system consists of deep neural network (DNN) and hidden Markov model (HMM), which is a hybrid DNN-HMM decoder. We investigate different transfer learning techniques to leverage knowledge and data from a resource-abundant source language to improve the keyword DNN training for a target language which has limited in-domain data. The approaches employed in this paper include training a DNN using source language data to initialize the target language DNN training, mixing data from source and target languages together in a multi-task DNN training setup, using logits computed from a DNN trained on the source language data to regularize the keyword DNN training in the target language, as well as combinations of these techniques. Given different amounts of target language training data, our experimental results show that these transfer learning techniques successfully improve keyword spotting performance for the target language, measured by the area under the curve (AUC) of DNN-HMM decoding detection error tradeoff (DET) curves using a large in-house far-field test set.
引用
收藏
页码:255 / 260
页数:6
相关论文
共 50 条
  • [1] EXPLORING REPRESENTATION LEARNING FOR SMALL-FOOTPRINT KEYWORD SPOTTING
    Cui, Fan
    Guo, Liyong
    Wang, Quandong
    Gao, Peng
    Wang, Yujun
    [J]. INTERSPEECH 2022, 2022, : 3258 - 3262
  • [2] DEEP RESIDUAL LEARNING FOR SMALL-FOOTPRINT KEYWORD SPOTTING
    Tang, Raphael
    Lin, Jimmy
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5484 - 5488
  • [3] Selective transfer subspace learning for small-footprint end-to-end cross-domain keyword spotting
    Ma, Fei
    Wang, Chengliang
    Li, Xusheng
    Zeng, Zhuo
    [J]. SPEECH COMMUNICATION, 2024, 156
  • [4] Convolutional Neural Networks for Small-footprint Keyword Spotting
    Sainath, Tara N.
    Parada, Carolina
    [J]. 16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 1478 - 1482
  • [5] Text Anchor Based Metric Learning for Small-footprint Keyword Spotting
    Wang, Li
    Gu, Rongzhi
    Chen, Nuo
    Zou, Yuexian
    [J]. INTERSPEECH 2021, 2021, : 4219 - 4223
  • [6] SMALL-FOOTPRINT KEYWORD SPOTTING WITH GRAPH CONVOLUTIONAL NETWORK
    Chen, Xi
    Yin, Shouyi
    Song, Dandan
    Ouyang, Peng
    Liu, Leibo
    Wei, Shaojun
    [J]. 2019 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU 2019), 2019, : 539 - 546
  • [7] Model compression applied to small-footprint keyword spotting
    Tucker, George
    Wu, Minhua
    Sun, Ming
    Panchapagesan, Sankaran
    Fu, Gengshen
    Vitaladevuni, Shiv
    [J]. 17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 1878 - 1882
  • [8] SMALL-FOOTPRINT KEYWORD SPOTTING USING DEEP NEURAL NETWORKS
    Chen, Guoguo
    Parada, Carolina
    Heigold, Georg
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [9] Region Proposal Network Based Small-Footprint Keyword Spotting
    Hou, Jingyong
    Shi, Yangyang
    Ostendorf, Mari
    Hwang, Mei-Yuh
    Xie, Lei
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2019, 26 (10) : 1471 - 1475
  • [10] IMPROVING RNN TRANSDUCER MODELING FOR SMALL-FOOTPRINT KEYWORD SPOTTING
    Tian, Yao
    Yao, Haitao
    Cai, Meng
    Liu, Yaming
    Ma, Zejun
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 5624 - 5628