On Improving Acoustic Models For TORGO Dysarthric Speech Database

Cited by: 8
Authors
Joy, Neethu Mariam [1]
Umesh, S. [1]
Abraham, Basil [1]
Affiliations
[1] Indian Inst Technol Madras, Madras, Tamil Nadu, India
Keywords
Dysarthria; TORGO; GMM-HMM; DNN; Recognition
DOI
10.21437/Interspeech.2017-878
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Speech-based assistive technologies have been shown to improve the quality of life of people affected by dysarthria, a motor speech disorder. This paper explores multiple ways to improve Gaussian mixture model-hidden Markov model (GMM-HMM) and deep neural network (DNN) based automatic speech recognition (ASR) systems for the TORGO database of dysarthric speech. Past attempts at developing ASR systems for the TORGO database were limited to training monophone models and performing speaker adaptation over them. Although a recent work attempted training triphone and neural network models, parameters such as the number of context-dependent states and the dimensionality of the principal component features were not properly tuned. This paper develops speaker-specific ASR models for each dysarthric speaker in the TORGO database by tuning the parameters of the GMM-HMM model and the number of layers and hidden nodes in the DNN. Employing a dropout scheme and sequence-discriminative training in the DNN also gave significant gains. Speaker-adapted features such as feature-space maximum likelihood linear regression (FMLLR) are used to pass speaker information to the DNNs. To the best of our knowledge, this paper presents the best recognition accuracies for the TORGO database to date.
Pages: 2695-2699
Page count: 5