DYSARTHRIC SPEECH RECOGNITION WITH LATTICE-FREE MMI

Authors
Hermann, Enno [1 ,2 ]
Magimai-Doss, Mathew [1 ]
Affiliations
[1] Idiap Res Inst, Martigny, Switzerland
[2] Ecole Polytech Fed Lausanne EPFL, Lausanne, Switzerland
Funding
EU Horizon 2020;
Keywords
Speech recognition; pathological speech processing; dysarthria; LF-MMI; ASR;
DOI
10.1109/icassp40776.2020.9053549
Chinese Library Classification number
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
Recognising dysarthric speech is a challenging problem as it differs in many aspects from typical speech, such as speaking rate and pronunciation. In the literature the focus so far has largely been on handling these variabilities in the framework of HMM/GMM and cross-entropy based HMM/DNN systems. This paper focuses on the use of state-of-the-art sequence-discriminative training, in particular lattice-free maximum mutual information (LF-MMI), for improving dysarthric speech recognition. Through a systematic investigation on the Torgo corpus we demonstrate that LF-MMI performs well on such atypical data and compensates much better for the low speaking rates of dysarthric speakers than conventionally trained systems. This can be attributed to inherent aspects of current speech recognition training regimes, like frame subsampling and speed perturbation, which obviate the need for some techniques previously adopted specifically for dysarthric speech.
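The abstract credits frame subsampling and speed perturbation with compensating for low speaking rates. The following is a minimal sketch of both operations, not the authors' implementation. The function names are hypothetical, and the perturbation factors (0.9, 1.0, 1.1) and subsampling factor (3) are assumptions based on values commonly used in LF-MMI recipes; this record does not state which values the paper used.

```python
def speed_perturb(samples, factor):
    """Resample a waveform by linear interpolation.

    factor > 1 shortens the signal (faster speech); factor < 1 lengthens it
    (slower speech). The factor values used below are assumptions.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    n_out = int(len(samples) / factor)
    out = []
    for i in range(n_out):
        pos = i * factor                      # fractional index into the input
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)    # clamp at the last sample
        frac = pos - lo
        out.append((1 - frac) * samples[lo] + frac * samples[hi])
    return out


def subsample_frames(frames, factor=3):
    """Keep every `factor`-th frame, lowering the output frame rate."""
    return frames[::factor]


if __name__ == "__main__":
    waveform = [0.0] * 16000                  # placeholder: 1 s at 16 kHz
    for f in (0.9, 1.0, 1.1):                 # assumed perturbation factors
        print(f, len(speed_perturb(waveform, f)))

    frames = list(range(100))                 # placeholder: 100 frames (1 s at 10 ms shift)
    print(len(subsample_frames(frames)))      # frames left after subsampling by 3
```

Speed perturbation exposes the model to slower and faster copies of each training utterance, while subsampling reduces the number of output frames per second that the model must account for.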
Pages: 6109-6113
Number of pages: 5