Improved Neural Bag-of-Words Model to Retrieve Out-of-Vocabulary Words in Speech Recognition

Cited by: 3
Authors
Sheikh, Imran [1 ,2 ,3 ,4 ]
Illina, Irina [1 ,2 ,3 ]
Fohr, Dominique [1 ,2 ,3 ]
Linares, Georges [4 ]
Affiliations
[1] Univ Lorraine, LORIA, UMR 7503, F-54506 Vandoeuvre Les Nancy, France
[2] Inria, F-54600 Villers Les Nancy, France
[3] CNRS, LORIA, UMR 7503, F-54506 Vandoeuvre Les Nancy, France
[4] Univ Avignon, Lab Informat Avignon, Avignon, France
Keywords
LVCSR; OOV; proper names
DOI
10.21437/Interspeech.2016-1219
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Many Proper Names (PNs) are Out-Of-Vocabulary (OOV) words for speech recognition systems used to process diachronic audio data. To enable recovery of the PNs missed by the system, relevant OOV PNs can be retrieved by exploiting the semantic context of the spoken content. In this paper, we explore the Neural Bag-of-Words (NBOW) model, proposed previously for text classification, to retrieve relevant OOV PNs. We propose a Neural Bag-of-Weighted-Words (NBOW2) model in which the input embedding layer is augmented with a context anchor layer. This layer learns to assign importance to input words and can capture task-specific keywords in an NBOW model. With experiments on French broadcast news videos, we show that the NBOW and NBOW2 models outperform earlier methods based on raw embeddings from LDA and Skip-gram. Combining NBOW with NBOW2 gives faster convergence during training.
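The abstract gives no equations, so the following is only a minimal sketch of how a weighted bag-of-words forward pass of this kind might look. It assumes that the context anchor layer is a single learned vector, that each word's importance is the sigmoid of the dot product between its embedding and that anchor, and that OOV PNs are ranked by a softmax output layer. All names (`nbow2_forward`, `anchor`, `out_weights`) and the random parameters are hypothetical and not taken from the paper.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(scores):
    # Subtract the maximum for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def nbow2_forward(word_ids, embeddings, anchor, out_weights, out_bias):
    """Return (probabilities over OOV PNs, per-word importance weights).

    Assumption: importance = sigmoid(embedding . anchor). Setting every
    importance weight to 1 would reduce this to a plain averaged NBOW.
    """
    vectors = [embeddings[i] for i in word_ids]
    alphas = [sigmoid(dot(v, anchor)) for v in vectors]
    dim = len(anchor)
    # Importance-weighted average of the input word embeddings.
    doc = [
        sum(a * v[k] for a, v in zip(alphas, vectors)) / len(vectors)
        for k in range(dim)
    ]
    scores = [dot(row, doc) + b for row, b in zip(out_weights, out_bias)]
    return softmax(scores), alphas


# Toy setup with random (untrained) parameters, for illustration only.
random.seed(0)
vocab_size, dim, num_oov_names = 20, 8, 5


def rand_vec(n):
    return [random.gauss(0.0, 0.5) for _ in range(n)]


embeddings = [rand_vec(dim) for _ in range(vocab_size)]
anchor = rand_vec(dim)
out_weights = [rand_vec(dim) for _ in range(num_oov_names)]
out_bias = [0.0] * num_oov_names

document = [3, 7, 7, 12]  # in-vocabulary word indices of one spoken document
probs, alphas = nbow2_forward(document, embeddings, anchor, out_weights, out_bias)

# Rank candidate OOV PNs from most to least relevant for this document.
ranking = sorted(range(num_oov_names), key=lambda j: -probs[j])
```

In this sketch, `ranking` plays the role of the retrieved list of candidate OOV PNs, and `alphas` exposes which input words the anchor treats as important. In a trained model, all parameters would be learned jointly from documents labelled with the OOV PNs they contain.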
Pages: 675-679
Number of pages: 5
Related papers
50 items in total
  • [1] Bag-of-words Modelling for Speech Recognition
    Ziolko, Bartosz
    Manandhar, Suresh
    Wilson, Richard C.
    [J]. INTERNATIONAL CONFERENCE ON FUTURE COMPUTER AND COMMUNICATIONS, PROCEEDINGS, 2009, : 646 - +
  • [2] SPEECH RECOGNITION OF FOREIGN OUT-OF-VOCABULARY WORDS USING A HIERARCHICAL LANGUAGE MODEL
    Yamamoto, Hirofumi
    Kikui, Genichiro
    Nakamura, Satoshi
    Sagisaka, Yoshinori
    [J]. INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 1870 - +
  • [3] Phoneme-to-grapheme conversion for out-of-vocabulary words in large vocabulary speech recognition
    Decadt, B
    Duchateau, J
    Daelemans, W
    Wambacq, P
    [J]. ASRU 2001: IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, CONFERENCE PROCEEDINGS, 2001, : 413 - 416
  • [4] A category based approach for recognition of out-of-vocabulary words
    Gallwitz, F
    Noth, E
    Niemann, H
    [J]. ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4, 1996, : 228 - 231
  • [5] An improved two-stage mixed language model approach for handling out-of-vocabulary words in large vocabulary continuous speech recognition
    Reveil, Bert
    Demuynck, Kris
    Martens, Jean-Pierre
    [J]. COMPUTER SPEECH AND LANGUAGE, 2014, 28 (01): : 141 - 162
  • [6] Transcription of out-of-vocabulary words in large vocabulary speech recognition based on phoneme-to-grapheme conversion
    Decadt, B
    Duchateau, J
    Daelemans, W
    Wambacq, P
    [J]. 2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002, : 861 - 864
  • [7] RNN Language Model Estimation for Out-of-Vocabulary Words
    Illina, Irina
    Fohr, Dominique
    [J]. HUMAN LANGUAGE TECHNOLOGY. CHALLENGES FOR COMPUTER SCIENCE AND LINGUISTICS, LTC 2017, 2020, 12598 : 199 - 211
  • [8] Finding Recurrent Out-of-Vocabulary Words
    Qin, Long
    Rudnicky, Alexander
    [J]. 14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 2241 - 2245
  • [9] A bag-of-words equivalent recurrent neural network for action recognition
    Richard, Alexander
    Gall, Juergen
    [J]. COMPUTER VISION AND IMAGE UNDERSTANDING, 2017, 156 : 79 - 91
  • [10] Lexicon Stratification for Translating Out-of-Vocabulary Words
    Tsvetkov, Yulia
    Dyer, Chris
    [J]. PROCEEDINGS OF THE 53RD ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL) AND THE 7TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (IJCNLP), VOL 2, 2015, : 125 - 131