A fast hierarchical search algorithm for discriminative keyword spotting

Cited by: 5
Authors
Tabibian, Shima [1, 2]
Akbari, Ahmad [1, 4]
Nasersharif, Babak [3]
Affiliations
[1] Iran Univ Sci & Technol, Dept Comp Engn, Audio & Speech Proc Lab, Tehran 1465774111, Iran
[2] Minist Sci Res & Technol, Aerosp Res Inst, Tehran 14665834, Iran
[3] KN Toosi Univ Technol, Dept Comp Engn, Tehran, Iran
[4] Aerosp Res Inst, Aerosp Res Inst Lane,Mahestan St,Iran Zamin St, Tehran 14665834, Iran
Keywords
Discriminative keyword spotting; Phone-based search; Modified Viterbi search; Hierarchical search
DOI
10.1016/j.ins.2015.12.010
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
A keyword spotter can be regarded as a binary classifier that divides a set of uttered sentences into two groups according to whether or not they contain the target keywords. For this classification task, the keyword spotter needs a fast and accurate search algorithm to locate the target keywords. In our previous work, we used a modified Viterbi (M-Viterbi) search algorithm, which has two known drawbacks. First, to locate the target keywords, it runs an exhaustive search through all possible segments of the input speech. Second, while computing the start and end time frames of each new phone, it forces the keyword spotter to trace back and re-evaluate the timing alignments of all previous phones, even though very little data, if any, is updated as a result. These two drawbacks dramatically enlarge the search space and significantly increase the computational complexity. In this paper, we propose a Hierarchical Search (H-Search) algorithm that lets the system ignore some segments of the input speech at each level of the hierarchy, according to their lower likelihood of containing the target keywords. In addition, unlike the M-Viterbi algorithm, the H-Search algorithm does not require repeated evaluations when computing the current phone alignment. Compared with the M-Viterbi algorithm, this yields a narrower search space (O(TP) versus O(TPLmax), where T is the number of frames, P is the number of keyword phones, and Lmax is the maximum phone duration) and a lower computational complexity (O(TPLmax) versus O(TPLmax^3)). We applied the H-Search algorithm to the classification part of an Evolutionary Discriminative Keyword Spotting (EDKWS) system introduced in our previous work. The experimental results indicate that the H-Search algorithm runs 100 times faster than the M-Viterbi algorithm, while the performance of the EDKWS system degrades by no more than two percent relative to its performance with the M-Viterbi algorithm. (C) 2015 Elsevier Inc. All rights reserved.
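The asymptotic bounds quoted in the abstract can be compared with a little arithmetic. The sketch below is only an illustration of how the leading terms scale: it ignores constant factors, the function names are hypothetical, and the sizes passed in (T = 300 frames, P = 6 phones, Lmax = 10 frames) are made-up values, not figures from the paper. It does not model the algorithms themselves and should not be read as predicting the measured speedup.

```python
def growth_terms(T, P, L_max):
    """Leading growth terms quoted in the abstract, with constants ignored.

    T     -- number of frames
    P     -- number of keyword phones
    L_max -- maximum phone duration (in frames)
    """
    return {
        "h_search_space": T * P,               # O(TP)
        "m_viterbi_space": T * P * L_max,      # O(TPLmax)
        "h_search_ops": T * P * L_max,         # O(TPLmax)
        "m_viterbi_ops": T * P * L_max ** 3,   # O(TPLmax^3)
    }


def ratios(T, P, L_max):
    """Return (search-space ratio, operation-count ratio), M-Viterbi over H-Search."""
    g = growth_terms(T, P, L_max)
    space_ratio = g["m_viterbi_space"] // g["h_search_space"]
    ops_ratio = g["m_viterbi_ops"] // g["h_search_ops"]
    return space_ratio, ops_ratio


# Hypothetical sizes chosen only for illustration.
space_ratio, ops_ratio = ratios(T=300, P=6, L_max=10)
print(space_ratio, ops_ratio)
```

Under these bounds the search-space ratio reduces to Lmax and the operation-count ratio to Lmax squared, independent of T and P, so the gap between the two algorithms widens as the maximum phone duration grows.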
Pages: 45-59
Number of pages: 15