DYSARTHRIC SPEECH RECOGNITION WITH LATTICE-FREE MMI

Authors
Hermann, Enno [1, 2]
Magimai-Doss, Mathew [1 ]
Affiliations
[1] Idiap Res Inst, Martigny, Switzerland
[2] Ecole Polytech Fed Lausanne EPFL, Lausanne, Switzerland
Funding
European Union Horizon 2020
Keywords
Speech recognition; pathological speech processing; dysarthria; LF-MMI; ASR
DOI
10.1109/icassp40776.2020.9053549
Chinese Library Classification number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
Recognising dysarthric speech is a challenging problem as it differs in many aspects from typical speech, such as speaking rate and pronunciation. In the literature the focus so far has largely been on handling these variabilities in the framework of HMM/GMM and cross-entropy based HMM/DNN systems. This paper focuses on the use of state-of-the-art sequence-discriminative training, in particular lattice-free maximum mutual information (LF-MMI), for improving dysarthric speech recognition. Through a systematic investigation on the Torgo corpus we demonstrate that LF-MMI performs well on such atypical data and compensates much better for the low speaking rates of dysarthric speakers than conventionally trained systems. This can be attributed to inherent aspects of current speech recognition training regimes, like frame subsampling and speed perturbation, which obviate the need for some techniques previously adopted specifically for dysarthric speech.
Pages: 6109-6113
Number of pages: 5