Context Dependent Acoustic Keyword Spotting Using Deep Neural Network

被引:0
|
作者
Wang, Guangsen
Sim, Khe Chai
机构
关键词
D O I
暂无
中图分类号
TM [电工技术]; TN [电子技术、通信技术];
学科分类号
0808 ; 0809 ;
摘要
Language model is an essential component of a speech recogniser. It provides the additional linguistic information to constrain the search space and guide the decoding. In this paper, language model is incorporated in the keyword spotting system to provide the contexts for the keyword models under the weighted finite state transducer framework. A context independent deep neural network is trained as the acoustic model. Three keyword contexts are investigated: the phone to keyword context, fixed length word context and the arbitrary length word context. To provide these contexts, a hybrid language model with both word and phone tokens is trained using only the word n-gram count. Three different spotting graphs are studied depending on the involved contexts: the keyword loop graph, the word fillers graph and the word loop fillers graph. These graphs are referred to as the context dependent (CD) keyword spotting graphs. The CD keyword spotting systems are evaluated on the Broadcasting News Hub4-97 F0 evaluation set. Experimental results reveal that the incorporation of the language model information provides performance gain over the baseline context independent graph without any contexts for all the three CD graphs. The best system using the arbitrary length word context has the comparable performance to the full decoding but triples the spotting speed. In addition, error analysis demonstrates that the language model information is essential to reduce both the insertion and deletion errors.
引用
收藏
页数:10
相关论文
共 50 条
  • [1] Contextual Keyword Spotting in Lecture Video With Deep Convolutional Neural Network
    Andra, Muhammad Bagus
    Usagawa, Tsuyoshi
    [J]. 2017 INTERNATIONAL CONFERENCE ON ADVANCED COMPUTER SCIENCE AND INFORMATION SYSTEMS (ICACSIS), 2017, : 198 - 203
  • [2] Keyword spotting in continuous speech using convolutional neural network
    Rostami, Amir Mohammad
    Karimi, Ali
    Akhaee, Mohammad Ali
    [J]. SPEECH COMMUNICATION, 2022, 142 : 15 - 21
  • [3] Keyword spotting in continuous speech using convolutional neural network
    Rostami, Amir Mohammad
    Karimi, Ali
    Akhaee, Mohammad Ali
    [J]. Speech Communication, 2022, 142 : 15 - 21
  • [4] Keyword spotting based on recurrent neural network
    Zhou, JL
    Liu, J
    Song, YT
    Yu, TC
    [J]. ICSP '98: 1998 FOURTH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, PROCEEDINGS, VOLS I AND II, 1998, : 710 - 713
  • [5] Handwritten keyword spotting using deep neural networks and certainty prediction
    Daraee, Fatemeh
    Mozaffari, Saeed
    Razavi, Seyyed Mohammad
    [J]. COMPUTERS & ELECTRICAL ENGINEERING, 2021, 92
  • [6] SMALL-FOOTPRINT KEYWORD SPOTTING USING DEEP NEURAL NETWORKS
    Chen, Guoguo
    Parada, Carolina
    Heigold, Georg
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [7] CONTEXT DEPENDENT STATE TYING FOR SPEECH RECOGNITION USING DEEP NEURAL NETWORK ACOUSTIC MODELS
    Bacchiani, Michiel
    Rybach, David
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [8] Deep Convolutional Spiking Neural Networks for Keyword Spotting
    Yilmaz, Emre
    Gevrek, Ozgur Bora
    Wu, Jibin
    Chen, Yuxiang
    Meng, Xuanbo
    Li, Haizhou
    [J]. INTERSPEECH 2020, 2020, : 2557 - 2561
  • [9] Deep Residual Spiking Neural Network for Keyword Spotting in Low-Resource Settings
    Yang, Qu
    Liu, Qi
    Li, Haizhou
    [J]. INTERSPEECH 2022, 2022, : 3023 - 3027
  • [10] Multitask Learning of Deep Neural Network-Based Keyword Spotting for IoT Devices
    Leem, Seong-Gyun
    Yoo, In-Chul
    Yook, Dongsuk
    [J]. IEEE TRANSACTIONS ON CONSUMER ELECTRONICS, 2019, 65 (02) : 188 - 194