Investigating Bidirectional Recurrent Neural Network Language Models for Speech Recognition

被引:24
|
作者
Chen, X. [1 ]
Ragni, A. [1 ]
Liu, X. [2 ]
Gales, M. J. F. [1 ]
机构
[1] Univ Cambridge, Engn Dept, Cambridge, England
[2] Chinese Univ Hong Kong, Hong Kong, Peoples R China
关键词
language model; bidirectional recurrent neural network; speech recognition; interpolation;
D O I
10.21437/Interapeech.2017-513
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Recurrent neural network language models (RNNLMs) are powerful language modeling techniques. Significant performance improvements have been reported in a range of tasks including speech recognition compared to n-gram language models. Conventional n-gram and neural network language models are trained to predict the probability of the next word given its preceding context history. In contrast, bidirectional recurrent neural network based language models consider the context from future words as well. This complicates the inference process. but has theoretical benefits for tasks such as speech recognition as additional context information can be used. However to date, very limited or no gains in speech recognition performance have been reported with this form of model. This paper examines the issues of training bidirectional recurrent neural network language models (bi-RNNLMs) for speech recognition. A bi-RNNLM probability smoothing technique is proposed, that addresses the very sharp posteriors that are often observed in these models. The performance of the bi-RNNLMs is evaluated on three speech recognition tasks: broadcast news: meeting transcription (AMI); and low-resource systems (Babel data). On all tasks gains are observed by applying the smoothing technique to the bi-RNNLM. In addition consistent performance gains can be obtained by combining bi-RNNLMs with n-gram and uni-directional RNNLMs.
引用
收藏
页码:269 / 273
页数:5
相关论文
共 50 条
  • [1] BIDIRECTIONAL RECURRENT NEURAL NETWORK LANGUAGE MODELS FOR AUTOMATIC SPEECH RECOGNITION
    Arisoy, Ebru
    Sethy, Abhinav
    Ramabhadran, Bhuvana
    Chen, Stanley
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 5421 - 5425
  • [2] GAUSSIAN PROCESS LSTM RECURRENT NEURAL NETWORK LANGUAGE MODELS FOR SPEECH RECOGNITION
    Lam, Max W. Y.
    Chen, Xie
    Hu, Shoukang
    Yu, Jianwei
    Liu, Xunying
    Meng, Helen
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 7235 - 7239
  • [3] Latent Words Recurrent Neural Network Language Models for Automatic Speech Recognition
    Masumura, Ryo
    Asami, Taichi
    Oba, Takanobu
    Sakauchi, Sumitaka
    Ito, Akinori
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2019, E102D (12) : 2557 - 2567
  • [4] Efficient Training and Evaluation of Recurrent Neural Network Language Models for Automatic Speech Recognition
    Chen, Xie
    Liu, Xunying
    Wang, Yongqiang
    Gales, Mark J. F.
    Woodland, Philip C.
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2016, 24 (11) : 2146 - 2157
  • [5] LIMITED-MEMORY BFGS OPTIMIZATION OF RECURRENT NEURAL NETWORK LANGUAGE MODELS FOR SPEECH RECOGNITION
    Liu, Xunying
    Liu, Shansong
    Sha, Jinze
    Yu, Jianwei
    Xu, Zhiyuan
    Chen, Xie
    Meng, Helen
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 6114 - 6118
  • [6] Comparison of Various Neural Network Language Models in Speech Recognition
    Zuo, Lingyun
    Liu, Jian
    Wan, Xin
    [J]. 2016 3RD INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE AND CONTROL ENGINEERING (ICISCE), 2016, : 894 - 898
  • [7] A Speech Recognition System for Bengali Language using Recurrent Neural Network
    Islam, Jahirul
    Mubassira, Masiath
    Islam, Md. Rakibul
    Das, Amit Kumar
    [J]. 2019 IEEE 4TH INTERNATIONAL CONFERENCE ON COMPUTER AND COMMUNICATION SYSTEMS (ICCCS 2019), 2019, : 73 - 76
  • [8] Recurrent Neural Network Language Model Adaptation for Conversational Speech Recognition
    Li, Ke
    Xu, Hainan
    Wang, Yiming
    Povey, Daniel
    Khudanpur, Sanjeev
    [J]. 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 3373 - 3377
  • [9] Recurrent Neural Network Language Model with Part-of-speech for Mandarin Speech Recognition
    Gong, Caixia
    Li, Xiangang
    Wu, Xihong
    [J]. 2014 9TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2014, : 459 - 463
  • [10] Conversion of Recurrent Neural Network Language Models to Weighted Finite State Transducers for Automatic Speech Recognition
    Lecorve, Gwenole
    Motlicek, Petr
    [J]. 13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 1666 - 1669