On the Impact of Children's Emotional Speech on Acoustic and Language Models

Cited by: 13
Authors
Steidl, Stefan [1]
Batliner, Anton [1]
Seppi, Dino [2]
Schuller, Bjoern [3]
Affiliations
[1] Univ Erlangen Nurnberg, Lehrstuhl Mustererkennung, D-91058 Erlangen, Germany
[2] Katholieke Univ Leuven, ESAT, B-3001 Louvain, Belgium
[3] Tech Univ Munich, Inst Human Machine Commun, D-80333 Munich, Germany
Keywords
RECOGNITION; ASR;
DOI
10.1155/2010/783954
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
The automatic recognition of children's speech is well known to be a challenge, and so is the influence of affect, which is believed to degrade the performance of a speech recogniser. In this contribution, we investigate the combination of both phenomena. Extensive test runs are carried out for 1k-vocabulary continuous speech recognition on spontaneous motherese, emphatic, and angry children's speech as opposed to neutral speech. The experiments address the question of how specific emotions influence word accuracy. In a first scenario, "emotional" speech recognisers are compared to a speech recogniser trained on neutral speech only. For this comparison, equal amounts of training data are used for each emotion-related state. In a second scenario, a "neutral" speech recogniser trained on large amounts of neutral speech is adapted by adding only some emotionally coloured data in the training process. The results show that emphatic and angry speech is recognised best, even better than neutral speech, and that the performance can be improved further by adaptation of the acoustic and linguistic models. In order to show the variability of emotional speech, we visualise the distribution of the four emotion-related states in the MFCC space by applying a Sammon transformation.
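The abstract reports results as word accuracy. As a minimal sketch of how that metric is conventionally computed (an assumption: the abstract does not say which scoring tool or alignment settings the authors used, and the function name `word_accuracy` is hypothetical), word accuracy is (N - S - D - I) / N, where N is the number of reference words and S, D, I are the substitutions, deletions, and insertions in a minimum-edit-distance alignment:

```python
def word_accuracy(reference: str, hypothesis: str) -> float:
    """Word accuracy = (N - S - D - I) / N = 1 - edit_distance / N.

    Illustrative only: whitespace tokenisation and unit edit costs are
    assumptions, not settings taken from the paper.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr

    # S + D + I of the best alignment equals the edit distance.
    return 1.0 - prev[-1] / len(ref)
```

Because insertions are counted as errors, this value can fall below zero when the hypothesis contains many extra words, which is why word accuracy is distinguished from the percentage of words correct.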
Pages: 14
Related papers
50 in total
  • [1] On the Impact of Children's Emotional Speech on Acoustic and Language Models
    Stefan Steidl
    Anton Batliner
    Dino Seppi
    Björn Schuller
    [J]. EURASIP Journal on Audio, Speech, and Music Processing, 2010
  • [2] Acoustic and Language Modeling for Children's Read Speech Assessment
    Tulsiani, Hitesh
    Swarup, Prakhar
    Rao, Preeti
    [J]. 2017 TWENTY-THIRD NATIONAL CONFERENCE ON COMMUNICATIONS (NCC), 2017,
  • [3] Simultaneous Adaptation of Acoustic and Language Models for Emotional Speech Recognition Using Tweet Data
    Kosaka, Tetsuo
    Saeki, Kazuya
    Aizawa, Yoshitaka
    Kato, Masaharu
    Nose, Takashi
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2024, E107D (03) : 363 - 373
  • [4] An Intervention Study of Language Cognition and Emotional Speech Community Method for Children's Speech Disorders
    Qiang, Yali
    [J]. INTERNATIONAL JOURNAL OF MENTAL HEALTH PROMOTION, 2023, 25 (05) : 627 - 637
  • [5] Study on the Treatment of Children's Speech Disorder by Language Cognition Emotional Speech Community Method
    Qing, Yali
    [J]. BASIC & CLINICAL PHARMACOLOGY & TOXICOLOGY, 2021, 128 : 41 - 41
  • [6] Ageing impact on acoustic correlates of speech emotional prosody
    Dmitrieva, E.
    Gelman, V
    Zaitseva, K.
    Orlov, A.
    [J]. PSYCHOLOGY & HEALTH, 2009, 24 : 158 - 159
  • [7] Learning Language and Acoustic Models for Identifying Alzheimer's Dementia From Speech
    Shah, Zehra
    Sawalha, Jeffrey
    Tasnim, Mashrura
    Qi, Shi-ang
    Stroulia, Eleni
    Greiner, Russell
    [J]. FRONTIERS IN COMPUTER SCIENCE, 2021, 3
  • [8] SPEECH RECOGNITION - ACOUSTIC, PHONETIC AND FORMAL-LANGUAGE MODELS
    MERMELSTEIN, P
    LEVINSON, S
    [J]. BIOTELEMETRY, 1975, 2 (1-2) : 121 - 123
  • [9] Acoustic and Language Models Adaptation for Indonesian Spontaneous Speech Recognition
    Lestari, Dessi Puji
    Irfani, Angela
    [J]. 2015 2ND INTERNATIONAL CONFERENCE ON ADVANCED INFORMATICS: CONCEPTS, THEORY AND APPLICATIONS ICAICTA, 2015,
  • [10] Adaptation strategies for the acoustic and language models in bilingual speech transcription
    Dieguez-Tirado, J
    Garcia-Mateo, C
    Docio-Fernandez, L
    Cardenal-Lopez, A
    [J]. 2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5: SPEECH PROCESSING, 2005, : 833 - 836