A survey on structured discriminative spoken keyword spotting

Cited by: 3
Authors
Tabibian, Shima [1]
Affiliations
[1] Shahid Beheshti Univ, Cyberspace Res Inst, Shahid Shahriari Sq,Daneshjou Blvd, Tehran 1983969411, Iran
Keywords
Deep learning; Discriminative model; Hidden Markov model; Spoken keyword spotting; Structured data; HIDDEN MARKOV-MODELS; SPEECH RECOGNITION; CONFIDENCE MEASURES; ALGORITHM; NORMALIZATION; COMPRESSION; CLASSIFIER; NETWORKS; PHONEME; SEARCH
DOI
10.1007/s10462-019-09739-y
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Spoken keyword spotting refers to the detection of all occurrences of desired words in continuous speech utterances. This paper comprehensively reviews spoken keyword spotting approaches, with an emphasis on discriminative ones. It surveys the most common datasets and evaluation measures used to train and evaluate spoken keyword spotting systems, and presents the main framework for structured discriminative spoken keyword spotting (SDKWS). The parts of the SDKWS framework, such as feature extraction, model training, the search algorithm, and thresholding, are discussed. The paper closes with conclusions and directions for future work.
Pages: 2483-2520
Number of pages: 38