MODELING UNCERTAINTY IN PREDICTING EMOTIONAL ATTRIBUTES FROM SPONTANEOUS SPEECH

Times cited: 0
Authors
Sridhar, Kusha [1]
Busso, Carlos [1]
Affiliations
[1] Univ Texas Dallas, Multimodal Signal Proc MSP Lab, Dept Elect & Comp Engn, Richardson, TX 75080 USA
Keywords
Speech Emotion Recognition; Monte Carlo dropout; activation functions; reject option; RECOGNITION; CORPUS
DOI
10.1109/icassp40776.2020.9054237
Chinese Library Classification (CLC) number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
A challenging task in affective computing is to build reliable speech emotion recognition (SER) systems that can accurately predict emotional attributes from spontaneous speech. To increase the trust in these SER systems, it is important to predict not only their accuracy, but also their confidence. An intriguing approach to predict uncertainty is Monte Carlo (MC) dropout, which obtains predictions from multiple feed-forward passes through a deep neural network (DNN) by using dropout regularization in both training and inference. This study evaluates this approach with regression models to predict emotional attribute scores for valence, arousal and dominance. The analysis illustrates that predicting uncertainty in this problem is possible, where the performance is higher for samples in the test set with lower uncertainty. The study evaluates uncertainty estimation as a function of the emotional attributes, showing that samples with extreme values have lower uncertainty. Finally, we demonstrate the benefits of uncertainty estimation with reject option, where a classifier can decline to give a prediction when its confidence is low. By rejecting only 25% of the test set with the highest uncertainty, we achieve relative performance gains of 7.34% for arousal, 13.73% for valence and 8.79% for dominance.
Pages: 8384-8388
Number of pages: 5
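
The abstract describes two mechanisms: Monte Carlo dropout (several stochastic forward passes with dropout left active at inference) and a reject option (declining to predict on the most uncertain samples). The following is a minimal NumPy sketch of both ideas, not the authors' implementation. The network size, the random untrained weights, the dropout rate, the number of passes, and the function names `mc_dropout_predict` and `reject_most_uncertain` are all assumptions made for illustration; using the standard deviation across passes as the uncertainty score is likewise an assumption.

```python
import numpy as np


def mc_dropout_predict(x, w1, b1, w2, b2, rng, n_passes=50, p_drop=0.5):
    """Run n_passes stochastic forward passes with dropout kept active.

    Returns the per-sample mean prediction and the per-sample standard
    deviation across passes, used here as an uncertainty score.
    """
    preds = np.empty((n_passes, x.shape[0]))
    for t in range(n_passes):
        h = np.maximum(x @ w1 + b1, 0.0)        # hidden layer with ReLU
        mask = rng.random(h.shape) >= p_drop    # keep a unit with prob. 1 - p_drop
        h = h * mask / (1.0 - p_drop)           # inverted-dropout rescaling
        preds[t] = (h @ w2 + b2).ravel()        # one regression score per sample
    return preds.mean(axis=0), preds.std(axis=0)


def reject_most_uncertain(uncertainty, reject_fraction=0.25):
    """Return a boolean mask that is False for the most uncertain samples."""
    n_reject = int(round(reject_fraction * uncertainty.size))
    order = np.argsort(uncertainty)             # ascending uncertainty
    keep = np.ones(uncertainty.size, dtype=bool)
    if n_reject > 0:
        keep[order[-n_reject:]] = False
    return keep


# Hypothetical data: 200 samples with 16 features each, and an untrained
# two-layer network standing in for a trained attribute regressor.
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 16))
w1, b1 = rng.normal(scale=0.3, size=(16, 32)), np.zeros(32)
w2, b2 = rng.normal(scale=0.3, size=(32, 1)), np.zeros(1)

mean_pred, uncertainty = mc_dropout_predict(x, w1, b1, w2, b2, rng)
keep = reject_most_uncertain(uncertainty, reject_fraction=0.25)
print(keep.sum(), "of", keep.size, "samples kept")
```

In a real evaluation, the regression metric would be computed once on all test samples and once on the samples where `keep` is true, and the relative gain reported in the abstract corresponds to the difference between those two numbers.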