Dysarthric speech classification from coded telephone speech using glottal features

Cited by: 32
Authors
Narendra, N. P. [1 ]
Alku, Paavo [1 ]
Affiliations
[1] Aalto Univ, Dept Signal Proc & Acoust, Espoo 00076, Finland
Funding
Academy of Finland
Keywords
Dysarthric speech; Glottal parameters; Glottal source estimation; Glottal inverse filtering; OpenSMILE; Support vector machines; Telemonitoring; PARKINSONS-DISEASE; INTELLIGIBILITY; DATABASE; MODELS; VOICE;
DOI
10.1016/j.specom.2019.04.003
Chinese Library Classification code
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
This paper proposes a new dysarthric speech classification method for coded telephone speech using glottal features. The proposed method utilizes glottal features, which are efficiently estimated from coded telephone speech using a recently proposed deep neural network-based glottal inverse filtering method. Two sets of glottal features were considered: (1) time- and frequency-domain parameters and (2) parameters based on principal component analysis (PCA). In addition, acoustic features were extracted from coded telephone speech using the openSMILE toolkit. The proposed method uses both the acoustic and glottal features extracted from coded speech utterances, together with their corresponding dysarthric/healthy labels, to train support vector machine classifiers. Separate classifiers were trained using the glottal and acoustic features individually as well as in combination. The coded telephone speech used in the experiments was generated with the adaptive multi-rate codec, which operates in two transmission bandwidths: narrowband (300 Hz - 3.4 kHz) and wideband (50 Hz - 7 kHz). The experiments were conducted using dysarthric and healthy speech utterances from the TORGO and Universal Access speech (UA-Speech) databases. Classification accuracy results indicated the effectiveness of glottal features in identifying dysarthria from coded telephone speech. The results also showed that combining the glottal features with the openSMILE-based acoustic features improved classification accuracy, which validates the complementary nature of the glottal features. The proposed dysarthric speech classification method can potentially be employed in telemonitoring applications for identifying the presence of dysarthria from coded telephone speech.
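As a rough illustration of the classification setup described in the abstract (not the authors' implementation), the sketch below trains RBF-kernel support vector machines on glottal features, acoustic features, and their combination, and reports cross-validated accuracy. The feature matrices, label vector, and all parameter values are placeholder assumptions; in the paper, the glottal parameters come from the glottal inverse filtering stage and the acoustic features from openSMILE, both computed on AMR-coded utterances.

    # Minimal sketch (assumed setup, not the authors' code): SVM classification of
    # dysarthric vs. healthy speech from pre-extracted feature matrices.
    import numpy as np
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    glottal_feats = rng.standard_normal((200, 12))    # placeholder glottal parameters
    acoustic_feats = rng.standard_normal((200, 88))   # placeholder openSMILE functionals
    labels = rng.integers(0, 2, size=200)             # placeholder healthy/dysarthric labels

    feature_sets = {
        "glottal": glottal_feats,
        "acoustic": acoustic_feats,
        "combined": np.hstack([glottal_feats, acoustic_feats]),
    }

    for name, X in feature_sets.items():
        # RBF-kernel SVM with feature standardization, evaluated by 5-fold cross-validation.
        clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
        scores = cross_val_score(clf, X, labels, cv=5)
        print(f"{name}: mean accuracy = {scores.mean():.3f}")

In this sketch, separate classifiers are trained per feature set, mirroring the paper's comparison of individual and combined glottal/acoustic features; the printed accuracies on the placeholder data carry no meaning beyond demonstrating the pipeline.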
Pages: 47 - 55
Number of pages: 9
Related papers (50 in total)
  • [41] Speaker verification from coded telephone speech using stochastic feature transformation and handset identification
    Yu, EWM
    Mak, MW
    Kung, SY
    ADVANCES IN MULTIMEDIA INFORMATION PROCESSING - PCM 2002, PROCEEDING, 2002, 2532 : 598 - 606
  • [42] Determination of glottal closure instants from clean and telephone quality speech signals using single frequency filtering
    Kadiri, Sudarsana Reddy
    Yegnanarayana, B.
    COMPUTER SPEECH AND LANGUAGE, 2020, 64
  • [43] Using articulatory likelihoods in the recognition of dysarthric speech
    Rudzicz, Frank
    SPEECH COMMUNICATION, 2012, 54 (03) : 430 - 444
  • [44] EMOTION CLASSIFICATION OF SPEECH USING MODULATION FEATURES
    Chaspari, Theodora
    Dimitriadis, Dimitrios
    Maragos, Petros
    2014 PROCEEDINGS OF THE 22ND EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2014, : 1552 - 1556
  • [45] Speech Emotion Classification using Acoustic Features
    Chen, Shizhe
    Jin, Qin
    Li, Xirong
    Yang, Gang
    Xu, Jieping
    2014 9TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2014, : 579 - 583
  • [46] Analytic phase features for dysarthric speech detection and intelligibility assessment
    Gurugubelli, Krishna
    Vuppala, Anil Kumar
    SPEECH COMMUNICATION, 2020, 121 : 1 - 15
  • [47] Acoustic features to characterize sentence accent production in dysarthric speech
    Ramos, Viviana Mendoza
    Kairuz Hernandez-Diaz, Hector A.
    Huici, Maria E. Hernandez-Diaz
    Martens, Heidi
    Van Nuffelen, Gwen
    De Bodt, Marc
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2020, 57
  • [48] Speech Intelligibility Assessment of Dysarthric Speech by using Goodness of Pronunciation with Uncertainty Quantification
    Yeo, Eun Jung
    Choi, Kwanghee
    Kim, Sunhee
    Chung, Minhwa
    INTERSPEECH 2023, 2023, : 166 - 170
  • [49] MULTILINGUAL ANALYSIS OF INTELLIGIBILITY CLASSIFICATION USING ENGLISH, KOREAN, AND TAMIL DYSARTHRIC SPEECH DATASETS
    Yeo, Eun Jung
    Kim, Sunhee
    Chung, Minhwa
    2022 25TH CONFERENCE OF THE ORIENTAL COCOSDA INTERNATIONAL COMMITTEE FOR THE CO-ORDINATION AND STANDARDISATION OF SPEECH DATABASES AND ASSESSMENT TECHNIQUES (O-COCOSDA 2022), 2022,
  • [50] Classification-Based Detection of Glottal Closure Instants from Speech Signals
    Matousek, Jindrich
    Tihelka, Daniel
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 3053 - 3057