On Improving Acoustic Models For TORGO Dysarthric Speech Database

Cited by: 8

Authors:

Joy, Neethu Mariam [1]

Umesh, S. [1]

Abraham, Basil [1]

Affiliation:

[1] Indian Inst Technol Madras, Madras, Tamil Nadu, India

Source:

18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION | 2017

Keywords:

Dysarthria; TORGO; GMM-HMM; DNN; RECOGNITION;

DOI:

10.21437/Interspeech.2017-878

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Assistive technologies based on speech have been shown to improve the quality of life of people affected by dysarthria, a motor speech disorder. This paper explores multiple ways to improve Gaussian mixture model-hidden Markov model (GMM-HMM) and deep neural network (DNN) based automatic speech recognition (ASR) systems for the TORGO database of dysarthric speech. Past attempts at developing ASR systems for the TORGO database were limited to training monophone models and performing speaker adaptation on them. Although a recent work attempted to train triphone and neural network models, parameters such as the number of context-dependent states and the dimensionality of the principal component features were not properly tuned. This paper develops speaker-specific ASR models for each dysarthric speaker in the TORGO database by tuning the parameters of the GMM-HMM model and the number of layers and hidden nodes in the DNN. Employing a dropout scheme and sequence-discriminative training in the DNN also gave significant gains. Speaker-adapted features such as feature-space maximum likelihood linear regression (FMLLR) are used to pass speaker information to the DNNs. To the best of our knowledge, this paper presents the best recognition accuracies for the TORGO database to date.
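The abstract mentions FMLLR features as the way speaker information reaches the DNN. As a rough illustration only: FMLLR is generally formulated as a per-speaker affine transform x' = A x + b applied to each feature frame. The sketch below shows just that application step in plain Python. The function name `apply_fmllr` and the toy values of `A` and `b` are hypothetical; estimating the transform (by maximum likelihood against a trained GMM-HMM) is not shown, and nothing here is taken from the authors' implementation.

```python
from typing import List, Sequence


def apply_fmllr(frames: Sequence[Sequence[float]],
                A: Sequence[Sequence[float]],
                b: Sequence[float]) -> List[List[float]]:
    """Apply the affine map x' = A x + b to every feature frame."""
    dim = len(b)
    if len(A) != dim or any(len(row) != dim for row in A):
        raise ValueError("A must be square and match the length of b")
    adapted = []
    for x in frames:
        if len(x) != dim:
            raise ValueError("frame dimension does not match the transform")
        # One output coefficient per row of A: dot(row, x) plus the bias term.
        adapted.append([
            sum(a_ij * x_j for a_ij, x_j in zip(row, x)) + b_i
            for row, b_i in zip(A, b)
        ])
    return adapted


# Hypothetical 2-dimensional toy data; real acoustic features have
# tens of dimensions, and A, b would be estimated per speaker.
frames = [[1.0, 2.0], [0.0, -1.0]]
A = [[2.0, 0.0], [0.0, 0.5]]
b = [0.5, -1.0]
adapted = apply_fmllr(frames, A, b)
```

In a setup like the one the abstract describes, the adapted frames, rather than the raw features, would be the input to the DNN, so the network sees features that are already normalized toward each speaker.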


Pages: 2695-2699

Page count: 5
