SEMI-SUPERVISED TRAINING OF ACOUSTIC MODELS USING LATTICE-FREE MMI

Authors
Manohar, Vimal [1, 2]
Hadian, Hossein [1]
Povey, Daniel [1, 2]
Khudanpur, Sanjeev [1, 2]
Affiliations
[1] Johns Hopkins Univ, Ctr Language & Speech Proc, Baltimore, MD 21218 USA
[2] Johns Hopkins Univ, Human Language Technol Ctr Excellence, Baltimore, MD 21218 USA
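Funding
National Science Foundation (USA)
Keywords
Semi-supervised training; Lattice-free MMI; Sequence training; Automatic speech recognition; Speech recognition
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract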
The lattice-free MMI objective (LF-MMI) has been used in supervised training of state-of-the-art neural network acoustic models for automatic speech recognition (ASR). With large amounts of unsupervised data available, extending this approach to the semi-supervised scenario is of significance. Finite-state transducer (FST) based supervision used with LF-MMI provides a natural way to incorporate uncertainties when dealing with unsupervised data. In this paper, we describe various extensions to standard LF-MMI training that allow lattices obtained by decoding unsupervised data to be used as supervision. The lattices are rescored with a strong language model (LM). We investigate different methods for splitting the lattices and incorporating frame tolerances into the supervision FST. We report results on different subsets of Fisher English, where we achieve WER recovery of 59-64% using lattice supervision, which is significantly better than using just the best path transcription.
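The abstract reports WER recovery without defining it. A minimal sketch follows, assuming the definition commonly used in semi-supervised ASR work: the fraction of the gap between a supervised-only baseline and an oracle system (trained with true transcripts for the unsupervised data) that the semi-supervised system closes. The function name and the WER values are hypothetical and chosen only for illustration; they are not results from the paper.

```python
def wer_recovery(baseline_wer: float, semisup_wer: float, oracle_wer: float) -> float:
    """Fraction of the baseline-to-oracle WER gap closed by semi-supervised training.

    Assumed definition (not stated in the abstract):
        (baseline - semisup) / (baseline - oracle)
    """
    gap = baseline_wer - oracle_wer
    if gap <= 0:
        raise ValueError("oracle WER must be lower than baseline WER")
    return (baseline_wer - semisup_wer) / gap


# Hypothetical WERs in percent, for illustration only.
recovery = wer_recovery(baseline_wer=20.0, semisup_wer=18.0, oracle_wer=16.0)
print(f"WER recovery: {recovery:.0%}")
```

Under this definition, a recovery of 100% would mean the semi-supervised system matches the oracle, and 0% would mean it does no better than the supervised baseline.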
Pages: 4844-4848
Number of pages: 5