Speech Emotion Recognition with Emotion-Pair based Framework Considering Emotion Distribution Information in Dimensional Emotion Space

Cited by: 16

Authors:

Ma, Xi ^{[1,3]}

Wu, Zhiyong ^{[1,2,3]}

Jia, Jia ^{[1,3]}

Xu, Mingxing ^{[1,3]}

Meng, Helen ^{[1,2]}

Cai, Lianhong ^{[1,3]}

Affiliations:

[1] Tsinghua Univ, Grad Sch Shenzhen, Tsinghua CHUK Joint Res Ctr Media Sci Technol & S, Shenzhen 518055, Peoples R China

[2] Chinese Univ Hong Kong, Dept Syst Engn & Engn Management, Shatin, Hong Kong, Peoples R China

[3] Tsinghua Univ, Dept Comp Sci & Technol, Tsinghua Natl Lab Informat Sci & Technol TNList, Beijing 100084, Peoples R China

Source:

18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION | 2017

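Funding:

National High Technology Research and Development Program of China (863 Program); National Natural Science Foundation of China;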

Keywords:

speech emotion recognition; emotion-pair; dimensional emotion space; Naive Bayes classifier; BINARY;

DOI:

10.21437/Interspeech.2017-619

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this work, an emotion-pair based framework is proposed for speech emotion recognition, which constructs a more discriminative feature subspace for every pair of different emotions (emotion-pair) to generate more precise binary emotion classification results. Furthermore, it is found that in the dimensional emotion space, some of the archetypal emotions lie closer to each other than others. Motivated by this, a Naive Bayes classifier based decision fusion strategy is proposed, which aims at capturing such useful emotion distribution information when deciding the final emotion category. We evaluated the classification framework on the USC IEMOCAP database. Experimental results demonstrate that the proposed method outperforms the hierarchical binary decision tree approach on both weighted accuracy (WA) and unweighted accuracy (UA). Moreover, our framework has the advantages that it can be generated fully automatically without empirical guidance and is easier to parallelize.
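The two stages named in the abstract, one binary classifier per emotion-pair followed by Naive Bayes decision fusion, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the nearest-centroid binary classifiers, the absence of per-pair feature selection, the class name `PairwiseNaiveBayesFusion`, and the toy two-dimensional features and labels.

```python
from collections import defaultdict
from itertools import combinations
import math


def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class PairwiseNaiveBayesFusion:
    """Toy emotion-pair classifier with Naive Bayes decision fusion."""

    def fit(self, X, y):
        self.labels = sorted(set(y))
        self.pairs = list(combinations(self.labels, 2))
        # Stage 1 (assumed stand-in): one nearest-centroid binary
        # classifier per emotion-pair.
        self.centroids = {
            (a, b): (
                centroid([x for x, t in zip(X, y) if t == a]),
                centroid([x for x, t in zip(X, y) if t == b]),
            )
            for a, b in self.pairs
        }
        # Stage 2: Naive Bayes over the pairwise decisions. Count how often
        # each pair classifier outputs vote v when the true label is t.
        self.class_count = defaultdict(int)
        self.vote_count = defaultdict(int)
        for x, t in zip(X, y):
            self.class_count[t] += 1
            for p, v in zip(self.pairs, self.pair_votes(x)):
                self.vote_count[(t, p, v)] += 1
        return self

    def pair_votes(self, x):
        """Return the decision of every emotion-pair classifier for x."""
        votes = []
        for a, b in self.pairs:
            ca, cb = self.centroids[(a, b)]
            votes.append(a if sq_dist(x, ca) <= sq_dist(x, cb) else b)
        return votes

    def predict(self, x):
        votes = self.pair_votes(x)
        total = sum(self.class_count.values())
        best, best_score = None, -math.inf
        for c in self.labels:
            score = math.log(self.class_count[c] / total)  # class prior
            for p, v in zip(self.pairs, votes):
                # Laplace smoothing; each pair classifier has two outcomes.
                num = self.vote_count[(c, p, v)] + 1
                score += math.log(num / (self.class_count[c] + 2))
            if score > best_score:
                best, best_score = c, score
        return best


# Hypothetical toy data: two features per utterance, three emotions.
X = [[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [5.1, 4.9], [0.0, 5.0], [0.1, 5.2]]
y = ["angry", "angry", "happy", "happy", "sad", "sad"]
model = PairwiseNaiveBayesFusion().fit(X, y)
print(model.predict([0.0, 4.8]))
```

In this sketch the fusion stage learns how each pair classifier tends to vote for every true emotion, including pairs that do not contain that emotion, which is one simple way to let confusable emotions inform the final decision. Since the pair classifiers are independent of one another, they could be trained in parallel.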

Pages: 1238-1242

Page count: 5
