Interpretable Deep Learning Model for the Detection and Reconstruction of Dysarthric Speech

Cited by: 12
Authors
Korzekwa, Daniel [1]
Barra-Chicote, Roberto [1]
Kostek, Bozena [2]
Drugman, Thomas [1]
Lajszczak, Mateusz [1]
Affiliations
[1] Amazon TTS Res, Cambridge, England
[2] Gdansk Univ Technol, Fac ETI, Gdansk, Poland
Keywords
dysarthria detection; speech recognition; speech synthesis; interpretable deep learning models
DOI
10.21437/Interspeech.2019-1206
Chinese Library Classification (CLC)
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification codes
100104; 100213
Abstract
We present a novel deep learning model for the detection and reconstruction of dysarthric speech. We train the model with a multi-task learning technique to jointly solve dysarthria detection and speech reconstruction tasks. The model's key feature is a low-dimensional latent space that is meant to encode the properties of dysarthric speech. It is commonly believed that neural networks are black boxes that solve problems but do not provide interpretable outputs. On the contrary, we show that this latent space successfully encodes interpretable characteristics of dysarthria, is effective at detecting dysarthria, and that manipulation of the latent space allows the model to reconstruct healthy speech from dysarthric speech. This work can help patients and speech pathologists to improve their understanding of the condition, lead to more accurate diagnoses, and aid in reconstructing healthy speech for afflicted patients.
Pages: 3890-3894
Number of pages: 5