NEURAL LATTICE SEARCH FOR SPEECH RECOGNITION

Cited by: 0
Authors
Ma, Rao [1]
Li, Hao [1]
Liu, Qi [1]
Chen, Lu [1]
Yu, Kai [1]
Affiliation
[1] Shanghai Jiao Tong Univ, MoE Key Lab Artificial Intelligence, Dept Comp Sci & Engn, SpeechLab, Shanghai, Peoples R China
Keywords
speech recognition; word lattice; lattice-to-sequence; attention models; forward-backward algorithm
DOI
10.1109/icassp40776.2020.9054109
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
To improve the accuracy of automatic speech recognition, a two-pass decoding strategy is widely adopted. The first-pass model generates compact word lattices, which are utilized by the second-pass model to perform rescoring. Currently, the most popular rescoring methods are N-best rescoring and lattice rescoring with long short-term memory language models (LSTMLMs). However, these methods encounter the problem of limited search space or inconsistency between training and evaluation. In this paper, we address these problems with an end-to-end model for accurately extracting the best hypothesis from the word lattice. Our model is composed of a bidirectional LatticeLSTM encoder followed by an attentional LSTM decoder. The model takes a word lattice as input and generates the single best hypothesis from the given lattice space. When combined with an LSTMLM, the proposed model yields 9.7% and 7.5% relative WER reduction compared to N-best rescoring methods and lattice rescoring methods, respectively, within the same amount of decoding time.
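The limited search space of N-best rescoring mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' model: the toy lattice, its scores, the `toy_lm` scorer, and all function names are assumptions made up for illustration. The sketch keeps only the top N first-pass paths before applying a second-pass score, so a hypothesis that the second pass would prefer can be lost when N is small. A helper for relative WER reduction, computed as (baseline - new) / baseline, is included as well.

```python
from typing import Callable, Dict, List, Tuple

# Toy word lattice (hypothetical): node -> list of (next_node, word, first-pass log score).
Lattice = Dict[int, List[Tuple[int, str, float]]]

LATTICE: Lattice = {
    0: [(1, "recognize", -1.0), (2, "wreck", -0.6)],
    2: [(1, "a", -0.2)],
    1: [(3, "speech", -0.5), (3, "beach", -0.4)],
}


def enumerate_paths(lattice: Lattice, start: int, end: int) -> List[Tuple[List[str], float]]:
    """Return every (words, summed score) path from start to end in an acyclic lattice."""
    if start == end:
        return [([], 0.0)]
    paths = []
    for nxt, word, score in lattice.get(start, []):
        for words, rest in enumerate_paths(lattice, nxt, end):
            paths.append(([word] + words, score + rest))
    return paths


def nbest_rescore(lattice: Lattice, start: int, end: int, n: int,
                  lm_score: Callable[[List[str]], float],
                  lm_weight: float = 1.0) -> List[str]:
    """Keep the n best first-pass paths, then return the best one under the combined score."""
    ranked = sorted(enumerate_paths(lattice, start, end), key=lambda p: p[1], reverse=True)
    nbest = ranked[:n]
    best = max(nbest, key=lambda p: p[1] + lm_weight * lm_score(p[0]))
    return best[0]


def toy_lm(words: List[str]) -> float:
    """Hypothetical second-pass scorer that strongly prefers one word sequence."""
    return 0.0 if words == ["recognize", "speech"] else -1.0


def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Relative reduction of new_wer with respect to baseline_wer."""
    return (baseline_wer - new_wer) / baseline_wer


# With n=2 the preferred hypothesis is pruned before rescoring; with n=4 it is recovered.
small_n = nbest_rescore(LATTICE, 0, 3, 2, toy_lm)
large_n = nbest_rescore(LATTICE, 0, 3, 4, toy_lm)
```

In this toy case `small_n` and `large_n` differ because the sequence favored by `toy_lm` ranks last in the first pass. Searching the whole lattice, as the abstract's lattice-based approach does, avoids that pruning step.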
Pages: 7794 - 7798
Page count: 5
Related papers
50 in total
  • [1] NEURAL ARCHITECTURE SEARCH FOR SPEECH EMOTION RECOGNITION
    Wu, Xixin
    Hu, Shoukang
    Wu, Zhiyong
    Liu, Xunying
    Meng, Helen
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 6902 - 6906
  • [2] LATENCY-CONTROLLED NEURAL ARCHITECTURE SEARCH FOR STREAMING SPEECH RECOGNITION
    He, Liqiang
    Feng, Shulin
    Su, Dan
    Yu, Dong
    2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2021, : 62 - 67
  • [3] EmotionNAS: Two-stream Neural Architecture Search for Speech Emotion Recognition
    Sun, Haiyang
    Lian, Zheng
    Liu, Bin
    Li, Ying
    Sun, Licai
    Cai, Cong
    Tao, Jianhua
    Wang, Meng
    Cheng, Yuan
    INTERSPEECH 2023, 2023, : 3597 - 3601
  • [4] Integration of speech recognition and machine translation: Speech recognition word lattice translation
    Zhang, RQ
    Kikui, G
    SPEECH COMMUNICATION, 2006, 48 (3-4) : 321 - 334
  • [5] NEURAL ARRAYS FOR SPEECH RECOGNITION
    TATTERSALL, GD
    LINFORD, PW
    LINGGARD, R
    BRITISH TELECOM TECHNOLOGY JOURNAL, 1988, 6 (02): 140 - 163
  • [6] Evolved Speech-Transformer: Applying Neural Architecture Search to End-to-End Automatic Speech Recognition
    Kim, Jihwan
    Wang, Jisung
    Kim, Sangki
    Lee, Yeha
    INTERSPEECH 2020, 2020, : 1788 - 1792
  • [7] MULTILINGUAL SPEECH EMOTION RECOGNITION WITH MULTI-GATING MECHANISM AND NEURAL ARCHITECTURE SEARCH
    Wang, Zihan
    Meng, Qi
    Lan, HaiFeng
    Zhang, XinRui
    Guo, KeHao
    Gupta, Akshat
    2022 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP, SLT, 2022, : 806 - 813
  • [8] Automatic Speech Recognition by Cuckoo Search Optimization based Artificial Neural Network Classifier
    Mendiratta, Sunanda
    Turk, Neelam
    Bansal, Dipali
    2015 INTERNATIONAL CONFERENCE ON SOFT COMPUTING TECHNIQUES AND IMPLEMENTATIONS (ICSCTI), 2015,
  • [9] Segmental search for continuous speech recognition
    Laface, P
    Fissore, L
    Maro, A
    Ravera, F
    ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4, 1996, : 2155 - 2158
  • [10] Cuckoo Search Algorithm for Speech Recognition
    Ghose, Rahul
    Das, Tejes
    Chattopadhyay, Soummyo Priyo
    Das, Tiyasha
    Saha, Ayoshna
    2015 INTERNATIONAL CONFERENCE AND WORKSHOP ON COMPUTING AND COMMUNICATION (IEMCON), 2015,