Articulatory and bottleneck features for speaker-independent ASR of dysarthric speech

Cited by: 20
Authors
Yilmaz, Emre [1 ,2 ]
Mitra, Vikramjit [3 ,6 ]
Sivaraman, Ganesh [4 ]
Franco, Horacio [5 ]
Affiliations
[1] Natl Univ Singapore, Dept Elect & Comp Engn, Singapore, Singapore
[2] Radboud Univ Nijmegen, CLS CLST, Nijmegen, Netherlands
[3] Univ Maryland, College Pk, MD USA
[4] Pindrop, Atlanta, GA USA
[5] SRI Int, Star Lab, 333 Ravenswood Ave, Menlo Pk, CA 94025 USA
[6] Apple Inc, Cupertino, CA 95014 USA
Source
Keywords
Pathological speech; Automatic speech recognition; Articulatory features; Time-frequency convolutional neural networks; Dysarthria; DEEP NEURAL-NETWORK; INTELLIGIBILITY; RECOGNITION; PARAMETERS; THERAPY; INDIVIDUALS; INTENSITY; KNOWLEDGE; STROKE; IMPACT;
DOI
10.1016/j.csl.2019.05.002
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Rapid population aging has stimulated the development of assistive devices that provide personalized medical support to people suffering from various etiologies. One prominent clinical application is a computer-assisted speech training system that delivers personalized speech therapy to patients with communicative disorders in their home environment. Such a system relies on robust automatic speech recognition (ASR) technology to provide accurate articulation feedback. With the long-term aim of developing off-the-shelf ASR systems that can be deployed in a clinical context without prior speaker information, we compare the ASR performance of speaker-independent bottleneck and articulatory features on dysarthric speech, used in conjunction with dedicated neural network-based acoustic models that have been shown to be robust against spectrotemporal deviations. We report the ASR performance of these systems on two dysarthric speech datasets with different characteristics to quantify the achieved performance gains. Despite the remaining performance gap between dysarthric and normal speech, significant improvements are obtained on both datasets using speaker-independent ASR architectures. (C) 2019 Elsevier Ltd. All rights reserved.
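The record contains no code; below is a minimal sketch, assuming PyTorch, of the general idea behind the bottleneck features referred to in the abstract: a neural network trained on phonetic targets with a deliberately narrow hidden layer, whose activations are reused as compact, relatively speaker-independent features for a downstream acoustic model. All layer sizes, the frame splicing, and the target inventory are illustrative assumptions, not the authors' configuration.

```python
# Minimal sketch (not the authors' implementation) of a bottleneck feature
# extractor: a feed-forward network trained to classify phonetic targets,
# whose narrow hidden layer provides compact features for downstream ASR.
# All dimensions below are illustrative assumptions.
import torch
import torch.nn as nn

class BottleneckExtractor(nn.Module):
    def __init__(self, n_input=40 * 11, n_hidden=1024, n_bottleneck=80, n_targets=3000):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(n_input, n_hidden), nn.ReLU(),
            nn.Linear(n_hidden, n_hidden), nn.ReLU(),
            nn.Linear(n_hidden, n_bottleneck),   # narrow bottleneck layer
        )
        # Classification head over phonetic targets, used only during training.
        self.head = nn.Sequential(nn.ReLU(), nn.Linear(n_bottleneck, n_targets))

    def forward(self, x):
        bottleneck = self.encoder(x)    # (batch, n_bottleneck) feature vectors
        logits = self.head(bottleneck)  # phonetic-target logits for training
        return bottleneck, logits

# Usage: spliced filterbank frames in, bottleneck features out.
frames = torch.randn(8, 40 * 11)        # 8 frames, 11-frame context of 40-dim filterbanks
features, logits = BottleneckExtractor()(frames)
print(features.shape, logits.shape)     # torch.Size([8, 80]) torch.Size([8, 3000])
```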
Pages: 319-334
Number of pages: 16
Related Papers
50 records in total
  • [1] Across-speaker Articulatory Normalization for Speaker-independent Silent Speech Recognition
    Wang, Jun
    Samal, Ashok
    Green, Jordan R.
    15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4, 2014, : 1179 - 1183
  • [2] Speaker-Independent Silent Speech Recognition with Across-Speaker Articulatory Normalization and Speaker Adaptive Training
    Wang, Jun
    Hahm, Seongjun
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 2415 - 2419
  • [3] Speaker-Independent Speech Recognition using Visual Features
    Pooventhiran, G.
    Sandeep, A.
    Manthiravalli, K.
    Harish, D.
    Renuka, Karthika D.
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2020, 11 (11) : 616 - 620
  • [4] Hardware oriented architectures for continuous-speech speaker-independent ASR systems
    Cardarilli, GC
    Malatesta, A
    Re, M
    Arnone, L
    Bocchio, S
PROCEEDINGS OF THE FOURTH IEEE INTERNATIONAL SYMPOSIUM ON SIGNAL PROCESSING AND INFORMATION TECHNOLOGY, 2004, : 346 - 352
  • [5] Multi-speaker articulatory trajectory formation based on speaker-independent articulatory HMMs
    Hiroya, Sadao
    Mochida, Takemi
    SPEECH COMMUNICATION, 2006, 48 (12) : 1677 - 1690
  • [6] Articulatory Features for ASR of Pathological Speech
    Yilmaz, Emre
    Mitra, Vikramjit
    Bartels, Chris
    Franco, Horacio
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 2958 - 2962
  • [7] SPEAKER-INDEPENDENT CONTINUOUS SPEECH DICTATION
    GAUVAIN, JL
    LAMEL, LF
    ADDA, G
    ADDADECKER, M
    SPEECH COMMUNICATION, 1994, 15 (1-2) : 21 - 37
  • [8] The study on continuous speech of speaker-independent
    Ye Hong
    CHINESE JOURNAL OF ELECTRONICS, 2006, 15 (4A): : 921 - 924
  • [9] Autoregressive Articulatory WaveNet Flow for Speaker-Independent Acoustic-to-Articulatory Inversion
    Bozorg, Narjes
    Johnson, Michael T.
    Soleymanpour, Mohammad
    2021 INTERNATIONAL CONFERENCE ON SPEECH TECHNOLOGY AND HUMAN-COMPUTER DIALOGUE (SPED), 2021, : 156 - 161
  • [10] Independent and Automatic Evaluation of Speaker-Independent Acoustic-to-Articulatory Reconstruction
    Parrot, Maud
    Millet, Juliette
    Dunbar, Ewan
    INTERSPEECH 2020, 2020, : 3740 - 3744