A Hybrid HMM/DNN Approach to Keyword Spotting of Short Words

Citations: 0
Authors
Chen, I-Fan [1]
Lee, Chin-Hui [1]
Affiliations
[1] Georgia Inst Technol, Sch Elect & Comp Engn, Atlanta, GA 30332 USA
Keywords
keyword and filler modeling; keyword detection; utterance verification; deep neural networks; knowledge-based; RECOGNITION; FEATURES;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
An HMM/DNN framework is proposed to address the issues of short-word detection. The first-stage keyword hypothesizer is redesigned with a context-aware keyword model and a 9-state filler model to reduce the miss rate from 80% to 6% and increase the figure-of-merit (FOM) from 6.08% to 21.88% for short words. The hypothesizer is followed by an MLP-based second-stage keyword verifier to further reduce its putative hits. To enhance short-word detection, three new techniques, namely an HMM-based feature transformation for the MLPs, knowledge-based features, and deep neural networks, are incorporated into the redesigned verifier. With a set of nine short keywords from the TIMIT set, the best FOM achieved by the proposed KWS system was 42.79%, which is comparable to the FOM of 42.6% for long content words and much better than the FOM of 18.4% for short keywords reported in previous research [10].
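The two-stage flow described in the abstract (a hypothesizer that proposes putative hits, then a verifier that prunes them) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the `Hit` structure, the fixed score threshold, the overlap-based matching rule, and the toy keywords and timings. It does not compute the FOM; it only shows how pruning putative hits trades false alarms against misses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """A putative keyword hit (hypothetical structure, not from the paper)."""
    keyword: str
    start: float            # start time in seconds
    end: float              # end time in seconds
    verifier_score: float   # assumed second-stage confidence in [0, 1]


def overlaps(hit, ref):
    """True if the hit has the same keyword as the reference and overlaps it in time."""
    kw, start, end = ref
    return hit.keyword == kw and hit.start < end and start < hit.end


def verify(hits, threshold):
    """Second stage: keep only hits whose verifier score reaches the threshold."""
    return [h for h in hits if h.verifier_score >= threshold]


def miss_rate(hits, references):
    """Fraction of reference occurrences that no hit overlaps."""
    if not references:
        return 0.0
    missed = sum(1 for ref in references if not any(overlaps(h, ref) for h in hits))
    return missed / len(references)


def false_alarms(hits, references):
    """Number of hits that overlap no reference occurrence."""
    return sum(1 for h in hits if not any(overlaps(h, ref) for ref in references))


# Toy data (made up): reference occurrences as (keyword, start, end).
references = [("the", 0.0, 0.2), ("in", 1.0, 1.2), ("to", 2.0, 2.2), ("of", 3.0, 3.2)]

# Assumed output of a first-stage hypothesizer, with verifier scores attached.
putative = [
    Hit("the", 0.02, 0.18, 0.9),
    Hit("in", 1.01, 1.19, 0.7),
    Hit("to", 2.05, 2.25, 0.4),   # correct hit with a low verifier score
    Hit("the", 4.00, 4.20, 0.3),  # false alarm
    Hit("of", 5.00, 5.10, 0.6),   # false alarm
]

verified = verify(putative, threshold=0.5)

stage1 = (miss_rate(putative, references), false_alarms(putative, references))
stage2 = (miss_rate(verified, references), false_alarms(verified, references))
```

With this toy data, the first stage misses one of four references and raises two false alarms; after verification at a threshold of 0.5, one false alarm remains but a second reference is missed. Sweeping the threshold exposes the trade-off that a summary metric such as the FOM is meant to capture.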
Pages: 1573-1577
Page count: 5