Dysarthric speech classification from coded telephone speech using glottal features

Cited by: 32
Authors
Narendra, N. P. [1 ]
Alku, Paavo [1 ]
Affiliations
[1] Aalto Univ, Dept Signal Proc & Acoust, Espoo 00076, Finland
Funding
Academy of Finland
Keywords
Dysarthric speech; Glottal parameters; Glottal source estimation; Glottal inverse filtering; OpenSMILE; Support vector machines; Telemonitoring; PARKINSONS-DISEASE; INTELLIGIBILITY; DATABASE; MODELS; VOICE;
DOI
10.1016/j.specom.2019.04.003
Chinese Library Classification code
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
This paper proposes a new dysarthric speech classification method for coded telephone speech using glottal features. The proposed method utilizes glottal features, which are efficiently estimated from coded telephone speech using a recently proposed deep neural network-based glottal inverse filtering method. Two sets of glottal features were considered: (1) time- and frequency-domain parameters and (2) parameters based on principal component analysis (PCA). In addition, acoustic features were extracted from coded telephone speech using the openSMILE toolkit. The proposed method uses both the acoustic and glottal features extracted from coded speech utterances, together with their corresponding dysarthric/healthy labels, to train support vector machine classifiers. Separate classifiers were trained using the glottal and acoustic features individually as well as in combination. The coded telephone speech used in the experiments was generated with the adaptive multi-rate codec, which operates in two transmission bandwidths: narrowband (300 Hz - 3.4 kHz) and wideband (50 Hz - 7 kHz). The experiments were conducted using dysarthric and healthy speech utterances from the TORGO and Universal Access speech (UA-Speech) databases. Classification accuracy results indicated the effectiveness of glottal features in identifying dysarthria from coded telephone speech. The results also showed that combining the glottal features with the openSMILE-based acoustic features improved classification accuracy, which validates the complementary nature of the glottal features. The proposed dysarthric speech classification method can potentially be employed in telemonitoring applications for identifying the presence of dysarthria from coded telephone speech.
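As a rough illustration of the classification setup described in the abstract (not the authors' implementation), the sketch below trains RBF-kernel support vector machines on glottal features, acoustic features, and their combination, and reports cross-validated accuracy. The feature matrices, label vector, and all parameter values are placeholder assumptions; in the paper, the glottal parameters come from the glottal inverse filtering stage and the acoustic features from openSMILE, both computed on AMR-coded utterances.

    # Minimal sketch (assumed setup, not the authors' code): SVM classification of
    # dysarthric vs. healthy speech from pre-extracted feature matrices.
    import numpy as np
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    glottal_feats = rng.standard_normal((200, 12))    # placeholder glottal parameters
    acoustic_feats = rng.standard_normal((200, 88))   # placeholder openSMILE functionals
    labels = rng.integers(0, 2, size=200)             # placeholder healthy/dysarthric labels

    feature_sets = {
        "glottal": glottal_feats,
        "acoustic": acoustic_feats,
        "combined": np.hstack([glottal_feats, acoustic_feats]),
    }

    for name, X in feature_sets.items():
        # RBF-kernel SVM with feature standardization, evaluated by 5-fold cross-validation.
        clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
        scores = cross_val_score(clf, X, labels, cv=5)
        print(f"{name}: mean accuracy = {scores.mean():.3f}")

In this sketch, separate classifiers are trained per feature set, mirroring the paper's comparison of individual and combined glottal/acoustic features; the printed accuracies on the placeholder data carry no meaning beyond demonstrating the pipeline.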
Pages: 47 - 55
Number of pages: 9
Related papers (50 in total)
  • [41] Speaker verification from coded telephone speech using stochastic feature transformation and handset identification
    Yu, EWM
    Mak, MW
    Kung, SY
    ADVANCES IN MULTIMEDIA INFORMATION PROCESSING - PCM 2002, PROCEEDING, 2002, 2532 : 598 - 606
  • [42] Determination of glottal closure instants from clean and telephone quality speech signals using single frequency filtering
    Kadiri, Sudarsana Reddy
    Yegnanarayana, B.
    COMPUTER SPEECH AND LANGUAGE, 2020, 64
  • [43] Using articulatory likelihoods in the recognition of dysarthric speech
    Rudzicz, Frank
    SPEECH COMMUNICATION, 2012, 54 (03) : 430 - 444
  • [44] EMOTION CLASSIFICATION OF SPEECH USING MODULATION FEATURES
    Chaspari, Theodora
    Dimitriadis, Dimitrios
    Maragos, Petros
    2014 PROCEEDINGS OF THE 22ND EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2014, : 1552 - 1556
  • [45] Speech Emotion Classification using Acoustic Features
    Chen, Shizhe
    Jin, Qin
    Li, Xirong
    Yang, Gang
    Xu, Jieping
    2014 9TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2014, : 579 - 583
  • [46] Analytic phase features for dysarthric speech detection and intelligibility assessment
    Gurugubelli, Krishna
    Vuppala, Anil Kumar
    SPEECH COMMUNICATION, 2020, 121 : 1 - 15
  • [47] Acoustic features to characterize sentence accent production in dysarthric speech
    Ramos, Viviana Mendoza
    Kairuz Hernandez-Diaz, Hector A.
    Huici, Maria E. Hernandez-Diaz
    Martens, Heidi
    Van Nuffelen, Gwen
    De Bodt, Marc
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2020, 57
  • [48] Speech Intelligibility Assessment of Dysarthric Speech by using Goodness of Pronunciation with Uncertainty Quantification
    Yeo, Eun Jung
    Choi, Kwanghee
    Kim, Sunhee
    Chung, Minhwa
    INTERSPEECH 2023, 2023, : 166 - 170
  • [49] MULTILINGUAL ANALYSIS OF INTELLIGIBILITY CLASSIFICATION USING ENGLISH, KOREAN, AND TAMIL DYSARTHRIC SPEECH DATASETS
    Yeo, Eun Jung
    Kim, Sunhee
    Chung, Minhwa
    2022 25TH CONFERENCE OF THE ORIENTAL COCOSDA INTERNATIONAL COMMITTEE FOR THE CO-ORDINATION AND STANDARDISATION OF SPEECH DATABASES AND ASSESSMENT TECHNIQUES (O-COCOSDA 2022), 2022,
  • [50] Classification-Based Detection of Glottal Closure Instants from Speech Signals
    Matousek, Jindrich
    Tihelka, Daniel
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 3053 - 3057