DYSARTHRIC SPEECH RECOGNITION WITH LATTICE-FREE MMI

Cited by: 0
Authors
Hermann, Enno [1, 2]
Magimai-Doss, Mathew [1]
Affiliations
[1] Idiap Res Inst, Martigny, Switzerland
[2] Ecole Polytech Fed Lausanne EPFL, Lausanne, Switzerland
Funding
European Union Horizon 2020;
Keywords
Speech recognition; pathological speech processing; dysarthria; LF-MMI; ASR;
DOI
10.1109/icassp40776.2020.9053549
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
Recognising dysarthric speech is a challenging problem as it differs in many aspects from typical speech, such as speaking rate and pronunciation. In the literature the focus so far has largely been on handling these variabilities in the framework of HMM/GMM and cross-entropy based HMM/DNN systems. This paper focuses on the use of state-of-the-art sequence-discriminative training, in particular lattice-free maximum mutual information (LF-MMI), for improving dysarthric speech recognition. Through a systematic investigation on the Torgo corpus we demonstrate that LF-MMI performs well on such atypical data and compensates much better for the low speaking rates of dysarthric speakers than conventionally trained systems. This can be attributed to inherent aspects of current speech recognition training regimes, like frame subsampling and speed perturbation, which obviate the need for some techniques previously adopted specifically for dysarthric speech.
Pages: 6109-6113
Page count: 5