ONLINE WORD-SPOTTING IN CONTINUOUS SPEECH WITH RECURRENT NEURAL NETWORKS

Cited by: 0
Authors
Baljekar, Pallavi [1, 2]
Lehman, Jill Fain [2]
Singh, Rita [1, 2]
Affiliations
[1] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[2] Disney Res, Pittsburgh, PA USA
Keywords
Continuous speech; Online word-spotting; Speech recognition; Recurrent neural networks; Gated networks
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper we introduce a simplified architecture for gated recurrent neural networks that can be used in single-pass applications, where word-spotting must be done in real time and phoneme-level information is not available for training. The network operates as a self-contained block in a strictly forward-pass configuration to directly generate keyword labels. We call these simple networks causal networks, because the current output is weighted only by past inputs and outputs. Since the basic network has a simpler architecture than the traditional memory networks used in keyword spotting, it also requires less data to train. Experiments on a standard speech database highlight the behavior and efficacy of such networks. Comparisons with a standard HMM-based keyword spotter show that these networks, while simple, are still more accurate.
Pages: 536-541
Page count: 6