STREAMING SMALL-FOOTPRINT KEYWORD SPOTTING USING SEQUENCE-TO-SEQUENCE MODELS

Cited by: 0
Authors
He, Yanzhang [1]
Prabhavalkar, Rohit [1]
Rao, Kanishka [1]
Li, Wei [1]
Bakhtin, Anton [1]
McGraw, Ian [1]
Affiliation
[1] Google Inc, Mountain View, CA 94043 USA
Keywords
Keyword spotting; sequence-to-sequence models; recurrent neural network transducer; attention; embedded speech recognition
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We develop streaming keyword spotting systems using a recurrent neural network transducer (RNN-T) model: an all-neural, end-to-end trained, sequence-to-sequence model which jointly learns acoustic and language model components. Our models are trained to predict either phonemes or graphemes as subword units, thus allowing us to detect arbitrary keyword phrases, without any out-of-vocabulary words. In order to adapt the models to the requirements of keyword spotting, we propose a novel technique which biases the RNN-T system towards a specific keyword of interest. Our systems are compared against a strong sequence-trained, connectionist temporal classification (CTC) based "keyword-filler" baseline, which is augmented with a separate phoneme language model. Overall, our RNN-T system with the proposed biasing technique significantly improves performance over the baseline system.
Pages: 474-481
Page count: 8
Related papers
50 in total
  • [1] Robust Small-Footprint Keyword Spotting Using Sequence-To-Sequence Model With Connectionist Temporal Classifier
    Xuan, Xiaoguang
    Wang, Mingjiang
    Zhang, Xin
    Sun, Fengjiao
    [J]. 2019 2ND IEEE INTERNATIONAL CONFERENCE ON INFORMATION COMMUNICATION AND SIGNAL PROCESSING (ICICSP), 2019, : 400 - 404
  • [2] SMALL-FOOTPRINT KEYWORD SPOTTING USING DEEP NEURAL NETWORKS
    Chen, Guoguo
    Parada, Carolina
    Heigold, Georg
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [3] Convolutional Neural Networks for Small-footprint Keyword Spotting
    Sainath, Tara N.
    Parada, Carolina
    [J]. 16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 1478 - 1482
  • [4] EXPLORING REPRESENTATION LEARNING FOR SMALL-FOOTPRINT KEYWORD SPOTTING
    Cui, Fan
    Guo, Liyong
    Wang, Quandong
    Gao, Peng
    Wang, Yujun
    [J]. INTERSPEECH 2022, 2022, : 3258 - 3262
  • [5] Model compression applied to small-footprint keyword spotting
    Tucker, George
    Wu, Minhua
    Sun, Ming
    Panchapagesan, Sankaran
    Fu, Gengshen
    Vitaladevuni, Shiv
    [J]. 17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 1878 - 1882
  • [6] SMALL-FOOTPRINT KEYWORD SPOTTING WITH GRAPH CONVOLUTIONAL NETWORK
    Chen, Xi
    Yin, Shouyi
    Song, Dandan
    Ouyang, Peng
    Liu, Leibo
    Wei, Shaojun
    [J]. 2019 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU 2019), 2019, : 539 - 546
  • [7] DEEP RESIDUAL LEARNING FOR SMALL-FOOTPRINT KEYWORD SPOTTING
    Tang, Raphael
    Lin, Jimmy
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5484 - 5488
  • [8] Small-Footprint Keyword Spotting for Controlling Smart Home Appliances Using TCN and CRNN Models
    Alapati, Hemalatha
    Paolini, Christopher
    Chinara, Suchismita
    Sarkar, Mahasweta
    [J]. INTERNATIONAL JOURNAL OF INTERDISCIPLINARY TELECOMMUNICATIONS AND NETWORKING, 2022, 14 (01)
  • [9] Region Proposal Network Based Small-Footprint Keyword Spotting
    Hou, Jingyong
    Shi, Yangyang
    Ostendorf, Mari
    Hwang, Mei-Yuh
    Xie, Lei
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2019, 26 (10) : 1471 - 1475
  • [10] IMPROVING RNN TRANSDUCER MODELING FOR SMALL-FOOTPRINT KEYWORD SPOTTING
    Tian, Yao
    Yao, Haitao
    Cai, Meng
    Liu, Yaming
    Ma, Zejun
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 5624 - 5628