Acoustic Features and Neural Representations for Categorical Emotion Recognition from Speech

Cited by: 17
Authors
Keesing, Aaron [1]
Koh, Yun Sing [1]
Witbrock, Michael [1]
Affiliations
[1] Univ Auckland, Sch Comp Sci, Auckland, New Zealand
Source
INTERSPEECH 2021
Keywords
speech emotion recognition; computational paralinguistics; affective computing; CORPUS
DOI
10.21437/Interspeech.2021-2217
Chinese Library Classification (CLC) number
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification code
100104; 100213
Abstract
Many features have been proposed for use in speech emotion recognition, from signal processing features to bag-of-audio-words (BoAW) models to abstract neural representations. Some of these feature types have not been directly compared across a large number of speech corpora to determine performance differences. We propose a full factorial design to compare speech processing features, BoAW and neural representations on 17 emotional speech datasets. We measure the performance of features in a categorical emotion classification problem for each dataset, using speaker-independent cross-validation with diverse classifiers. Results show statistically significant differences between features and between classifiers, with large effect sizes between features. In particular, standard acoustic feature sets still perform competitively with neural representations, while neural representations have a larger range of performance, and BoAW features lie in the middle. The best and worst neural representations were wav2vec and VGGish, respectively, with wav2vec performing best out of all tested features. These results indicate that standard acoustic feature sets are still very useful baselines for emotional classification, but high-quality neural speech representations can be better.
Pages: 3415-3419
Number of pages: 5