Aphasic Speech Recognition using a Mixture of Speech Intelligibility Experts

Cited by: 5
Authors
Perez, Matthew [1 ]
Aldeneh, Zakaria [1 ]
Provost, Emily Mower [1 ]
Affiliations
[1] Univ Michigan, Ann Arbor, MI 48109 USA
Source
Funding
National Science Foundation (USA);
Keywords
disordered speech recognition; aphasia; speech intelligibility; mixture of experts; automatic speech recognition; SYSTEM;
DOI
10.21437/Interspeech.2020-2049
Chinese Library Classification (CLC) number
R36 [Pathology]; R76 [Otorhinolaryngology];
Discipline classification code
100104 ; 100213 ;
Abstract
Robust speech recognition is a key prerequisite for semantic feature extraction in automatic aphasic speech analysis. However, standard one-size-fits-all automatic speech recognition models perform poorly when applied to aphasic speech. One reason for this is the wide range of speech intelligibility due to different levels of severity (i.e., higher severity lends itself to less intelligible speech). To address this, we propose a novel acoustic model based on a mixture of experts (MoE), which handles the varying intelligibility stages present in aphasic speech by explicitly defining severity-based experts. At test time, the contribution of each expert is decided by estimating speech intelligibility with a speech intelligibility detector (SID). We show that our proposed approach significantly reduces phone error rates across all severity stages in aphasic speech compared to a baseline approach that does not incorporate severity information into the modeling process.
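The weighting idea in the abstract can be illustrated with a minimal sketch. It assumes each severity-based expert emits a phone posterior for a frame, and that the SID yields one raw score per expert that is normalized with a softmax. The names `combine_experts` and `sid_scores`, the softmax normalization, and the toy numbers are all assumptions made for illustration; the abstract does not specify how the paper's model combines its experts.

```python
import math


def softmax(scores):
    """Turn raw scores into non-negative weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def combine_experts(expert_posteriors, sid_scores):
    """Mix per-expert phone posteriors using SID-derived weights.

    expert_posteriors: one list per expert, each a probability
        distribution over phones for a single frame (hypothetical layout).
    sid_scores: one raw intelligibility score per expert (hypothetical).
    """
    weights = softmax(sid_scores)
    num_phones = len(expert_posteriors[0])
    return [
        sum(w * post[p] for w, post in zip(weights, expert_posteriors))
        for p in range(num_phones)
    ]


# Toy example: two severity experts, three phones.
experts = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]

# Equal SID scores give each expert the same weight.
mixed = combine_experts(experts, sid_scores=[0.0, 0.0])

# A strongly one-sided SID score makes the first expert dominate.
dominated = combine_experts(experts, sid_scores=[100.0, 0.0])
```

Because the weights sum to 1 and each expert output is a distribution, the mixed output is also a valid distribution over phones.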
Pages: 4986-4990
Number of pages: 5