Using articulatory likelihoods in the recognition of dysarthric speech

Cited by: 27
Author
Rudzicz, Frank [1]
Affiliation
[1] Univ Toronto, Dept Comp Sci, Toronto, ON, Canada
Keywords
Dysarthria; Speech recognition; Acoustic-articulatory inversion; Task-dynamics; RECOVERING ARTICULATION; MODEL
DOI
10.1016/j.specom.2011.10.006
Chinese Library Classification number
O42 [Acoustics]
Subject classification code
070206; 082403
Abstract
Millions of individuals have congenital or acquired neuro-motor conditions that limit control of their muscles, including those that manipulate the vocal tract. These conditions, collectively called dysarthria, result in speech that is very difficult to understand both by human listeners and by traditional automatic speech recognition (ASR), which in some cases can be rendered completely unusable. In this work we first introduce a new method for acoustic-to-articulatory inversion which estimates positions of the vocal tract given acoustics using a nonlinear Hammerstein system. This is accomplished based on the theory of task-dynamics using the TORGO database of dysarthric articulation. Our approach uses adaptive kernel canonical correlation analysis and is found to be significantly more accurate than mixture density networks, at or above the 95% level of confidence for most vocal tract variables. Next, we introduce a new method for ASR in which acoustic-based hypotheses are re-evaluated according to the likelihoods of their articulatory realizations in task-dynamics. This approach incorporates high-level, long-term aspects of speech production and is found to be significantly more accurate than hidden Markov models, dynamic Bayesian networks, and switching Kalman filters. © 2011 Elsevier B.V. All rights reserved.
Pages: 430 - 444
Number of pages: 15
Related papers
50 in total
  • [1] Articulatory Knowledge in the Recognition of Dysarthric Speech
    Rudzicz, Frank
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2011, 19 (04): 947 - 960
  • [2] APPLYING DISCRETIZED ARTICULATORY KNOWLEDGE TO DYSARTHRIC SPEECH
    Rudzicz, Frank
    [J]. 2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-8, PROCEEDINGS, 2009: 4501 - 4504
  • [3] EXPLORING ARTICULATORY CHARACTERISTICS OF CANTONESE DYSARTHRIC SPEECH USING DISTINCTIVE FEATURES
    Wong, Ka Ho
    Yeung, Wing Sum
    Yeung, Yu Ting
    Meng, Helen
    [J]. 2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS, 2016: 6495 - 6499
  • [4] SYNTHESIZING DYSARTHRIC SPEECH USING MULTI-SPEAKER TTS FOR DYSARTHRIC SPEECH RECOGNITION
    Soleymanpour, Mohammad
    Johnson, Michael T.
    Soleymanpour, Rahim
    Berry, Jeffrey
    [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022: 7382 - 7386
  • [5] MULTI-MODAL ACOUSTIC-ARTICULATORY FEATURE FUSION FOR DYSARTHRIC SPEECH RECOGNITION
    Yue, Zhengjun
    Loweimi, Erfan
    Cvetkovic, Zoran
    Christensen, Heidi
    Barker, Jon
    [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022: 7372 - 7376
  • [6] Assessment of articulatory sub-systems of dysarthric speech using an isolated-style phoneme recognition system
    Vijayalakshmi, P.
    Reddy, M. R.
    O'Shaughnessy, Douglas
    [J]. INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 981 - +
  • [7] Using speech rhythm knowledge to improve dysarthric speech recognition
    Selouani, S. -A.
    Dahmani, H.
    Amami, R.
    Hamam, H.
    [J]. INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2012, 15 (01) : 57 - 64
  • [8] Data Augmentation using Healthy Speech for Dysarthric Speech Recognition
    Vachhani, Bhavik
    Bhat, Chitralekha
    Kopparapu, Sunil Kumar
    [J]. 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018: 471 - 475
  • [9] Using speech rhythm knowledge to improve dysarthric speech recognition
    S.-A. Selouani
    H. Dahmani
    R. Amami
    H. Hamam
    [J]. International Journal of Speech Technology, 2012, 15 (1) : 57 - 64
  • [10] ACOUSTIC-TO-ARTICULATORY INVERSION FOR DYSARTHRIC SPEECH BY USING CROSS-CORPUS ACOUSTIC-ARTICULATORY DATA
    Maharana, Sarthak Kumar
    Illa, Aravind
    Mannem, Renuka
    Belur, Yamini
    Shetty, Preetie
    Kumar, Veeramani Preethish
    Vengalil, Seena
    Polavarapu, Kiran
    Atchayaram, Nalini
    Ghosh, Prasanta Kumar
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021: 6458 - 6462