Discriminative keyword spotting

Cited by: 73
Authors
Keshet, Joseph [1 ]
Grangier, David [2 ]
Bengio, Samy [3 ]
Affiliations
[1] IDIAP Res Inst, CH-1920 Martigny, Switzerland
[2] NEC Labs Amer, Princeton, NJ 08540 USA
[3] Google Inc, Mountain View, CA 94043 USA
Keywords
Keyword spotting; Spoken term detection; Speech recognition; Large margin and kernel methods; Support vector machines; Discriminative models
DOI
10.1016/j.specom.2008.10.002
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper proposes a new approach for keyword spotting, which is based on large margin and kernel methods rather than on HMMs. Unlike previous approaches, the proposed method employs a discriminative learning procedure, in which the learning phase aims at achieving a high area under the ROC curve, as this quantity is the most common measure to evaluate keyword spotters. The keyword spotter we devise is based on mapping the input acoustic representation of the speech utterance along with the target keyword into a vector-space. Building on techniques used for large margin and kernel methods for predicting whole sequences, our keyword spotter distills to a classifier in this vector-space, which separates speech utterances in which the keyword is uttered from speech utterances in which the keyword is not uttered. We describe a simple iterative algorithm for training the keyword spotter and discuss its formal properties, showing theoretically that it attains high area under the ROC curve. Experiments on read speech with the TIMIT corpus show that the resulting discriminative system outperforms the conventional context-independent HMM-based system. Further experiments using the TIMIT trained model, but tested on both read (HTIMIT, WSJ) and spontaneous speech (OGI Stories), show that without further training or adaptation to the new corpus our discriminative system outperforms the conventional context-independent HMM-based system. (C) 2008 Elsevier B.V. All rights reserved.
Pages: 317-329
Number of pages: 13