VIRTUAL ADVERSARIAL TRAINING FOR DS-CNN BASED SMALL-FOOTPRINT KEYWORD SPOTTING

Citations: 0
Authors
Wang, Xiong [1]
Sun, Sining [1]
Xie, Lei [1]
Affiliations
[1] Northwestern Polytech Univ, Sch Comp Sci, Audio Speech & Language Proc Grp, Xian, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
depthwise separable convolutional neural network; DS-CNN; KWS; virtual adversarial training;
DOI
10.1109/asru46091.2019.9003745
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Serving as the trigger of a voice-enabled user interface, an on-device keyword spotting (KWS) model has to be extremely compact, efficient and accurate. In this paper, we adopt a depthwise separable convolutional neural network (DS-CNN) as our small-footprint KWS model, which is highly competitive to these ends. However, a recent study has shown that a compact KWS system is very vulnerable to small adversarial perturbations, while augmenting the training data with specifically generated adversarial examples can improve performance. In this paper, we further improve KWS performance through a virtual adversarial training (VAT) solution. Instead of using adversarial examples for data augmentation, we propose to train a DS-CNN KWS model using adversarial regularization, which explicitly introduces a distribution smoothness measure into the loss function in order to smooth the model's distribution and thus improve robustness. Experiments on a KWS corpus collected with a circular microphone array in a far-field scenario show that the VAT approach brings a 31.9% relative false rejection rate (FRR) reduction compared to normal training with cross-entropy loss, and that it also surpasses the adversarial-example-based data augmentation approach with a 10.3% relative FRR reduction.
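The abstract describes adding a distribution smoothness measure to the loss without giving its form. The sketch below illustrates one common way such a VAT regularizer is computed: a power-iteration step finds a small input perturbation that changes the predicted distribution, and the KL divergence between the clean and perturbed predictions is added to the cross-entropy loss. Everything concrete here is an assumption for illustration and is not taken from the paper: the toy linear softmax classifier standing in for the DS-CNN, the helper names, and the values of `alpha`, `eps` and `xi`.

```python
import math
import random


def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(W, x):
    # Toy stand-in for the KWS model (assumption): p(y|x) = softmax(W x).
    return softmax([sum(w * xj for w, xj in zip(row, x)) for row in W])


def kl(p, q):
    # KL(p || q) between two discrete distributions.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def normalize(v):
    norm = math.sqrt(sum(vj * vj for vj in v)) or 1.0
    return [vj / norm for vj in v]


def kl_grad_wrt_input(W, p, x_pert):
    # For the linear softmax model, d KL(p || q(x')) / d x' = W^T (q - p),
    # with the clean prediction p held fixed.
    q = predict(W, x_pert)
    return [sum(W[k][j] * (q[k] - p[k]) for k in range(len(W)))
            for j in range(len(x_pert))]


def virtual_adversarial_perturbation(W, x, eps=0.5, xi=1e-3, n_power=1, seed=0):
    # Power iteration: start from a random unit direction, probe the model
    # at x + xi * d, and follow the gradient of the KL divergence.
    rng = random.Random(seed)
    p = predict(W, x)
    d = normalize([rng.gauss(0.0, 1.0) for _ in x])
    for _ in range(n_power):
        probe = [xj + xi * dj for xj, dj in zip(x, d)]
        d = normalize(kl_grad_wrt_input(W, p, probe))
    return [eps * dj for dj in d]


def vat_loss(W, x, label, alpha=1.0, eps=0.5):
    # Total loss = cross entropy + alpha * smoothness term (KL divergence
    # between predictions on the clean and the perturbed input).
    p = predict(W, x)
    r = virtual_adversarial_perturbation(W, x, eps=eps)
    q = predict(W, [xj + rj for xj, rj in zip(x, r)])
    ce = -math.log(p[label])
    smoothness = kl(p, q)
    return ce + alpha * smoothness, ce, smoothness


W = [[1.0, -0.5, 0.2], [-0.3, 0.8, 0.1]]  # hypothetical 2-class weights
x = [0.4, -0.2, 1.0]                      # hypothetical feature vector
total, ce, smoothness = vat_loss(W, x, label=0)
print(total, ce, smoothness)
```

Note that the smoothness term does not use the label, which is why the perturbation is called "virtual" adversarial; in a real system the gradient would come from automatic differentiation through the network rather than the closed form used here.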
Pages: 607-612
Page count: 6
Related papers
50 in total
  • [1] Region Proposal Network Based Small-Footprint Keyword Spotting
    Hou, Jingyong
    Shi, Yangyang
    Ostendorf, Mari
    Hwang, Mei-Yuh
    Xie, Lei
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2019, 26 (10) : 1471 - 1475
  • [2] Convolutional Neural Networks for Small-footprint Keyword Spotting
    Sainath, Tara N.
    Parada, Carolina
    [J]. 16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 1478 - 1482
  • [3] EXPLORING REPRESENTATION LEARNING FOR SMALL-FOOTPRINT KEYWORD SPOTTING
    Cui, Fan
    Guo, Liyong
    Wang, Quandong
    Gao, Peng
    Wang, Yujun
    [J]. INTERSPEECH 2022, 2022, : 3258 - 3262
  • [4] SMALL-FOOTPRINT KEYWORD SPOTTING WITH GRAPH CONVOLUTIONAL NETWORK
    Chen, Xi
    Yin, Shouyi
    Song, Dandan
    Ouyang, Peng
    Liu, Leibo
    Wei, Shaojun
    [J]. 2019 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU 2019), 2019, : 539 - 546
  • [5] Domain Aware Training for Far-field Small-footprint Keyword Spotting
    Wu, Haiwei
    Jia, Yan
    Nie, Yuanfei
    Li, Ming
    [J]. INTERSPEECH 2020, 2020, : 2562 - 2566
  • [6] ADVERSARIAL EXAMPLES FOR IMPROVING END-TO-END ATTENTION-BASED SMALL-FOOTPRINT KEYWORD SPOTTING
    Wang, Xiong
    Sun, Sining
    Shan, Changhao
    Hou, Jingyong
    Xie, Lei
    Li, Shen
    Lei, Xin
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 6366 - 6370
  • [7] Model compression applied to small-footprint keyword spotting
    Tucker, George
    Wu, Minhua
    Sun, Ming
    Panchapagesan, Sankaran
    Fu, Gengshen
    Vitaladevuni, Shiv
    [J]. 17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 1878 - 1882
  • [8] DEEP RESIDUAL LEARNING FOR SMALL-FOOTPRINT KEYWORD SPOTTING
    Tang, Raphael
    Lin, Jimmy
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5484 - 5488
  • [9] Text Anchor Based Metric Learning for Small-footprint Keyword Spotting
    Wang, Li
    Gu, Rongzhi
    Chen, Nuo
    Zou, Yuexian
    [J]. INTERSPEECH 2021, 2021, : 4219 - 4223
  • [10] SMALL-FOOTPRINT KEYWORD SPOTTING USING DEEP NEURAL NETWORKS
    Chen, Guoguo
    Parada, Carolina
    Heigold, Georg
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,