Integrating Recurrence Dynamics for Speech Emotion Recognition

Cited by: 11

Authors:

Tzinis, Efthymios [1,2]

Paraskevopoulos, Georgios [1,2]

Baziotis, Christos [1]

Potamianos, Alexandros [1,2]

Affiliations:

[1] Natl Tech Univ Athens, Sch Elect & Comp Engn, Athens, Greece

[2] Behav Signal Technol, Los Angeles, CA USA

Source:

19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES | 2018

Funding:

European Union Horizon 2020;

Keywords:

speech emotion recognition; recurrence quantification analysis; nonlinear dynamics; recurrence plots; FEATURES; FRAMEWORK;

DOI:

10.21437/Interspeech.2018-1377

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We investigate the performance of features that can capture nonlinear recurrence dynamics embedded in the speech signal for the task of Speech Emotion Recognition (SER). The Recurrence Plot (RP) computed from the reconstructed phase space of each speech frame reveals complex structures that can be measured by performing Recurrence Quantification Analysis (RQA). These measures are aggregated using statistical functionals over segment and utterance periods. We report SER results for the proposed feature set on three databases using different classification methods. When fusing the proposed features with traditional feature sets, e.g., [1], we show an improvement in unweighted accuracy of up to 5.7% and 10.7% on Speaker-Dependent (SD) and Speaker-Independent (SI) SER tasks, respectively, over the baseline [1]. Following a segment-based approach, we demonstrate state-of-the-art performance on IEMOCAP using a Bidirectional Recurrent Neural Network.
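
The pipeline the abstract describes (phase-space reconstruction of a frame, its recurrence plot, then RQA measures) can be illustrated with a minimal Python sketch. This is an illustration under stated assumptions, not the authors' implementation: the function names, the embedding dimension `dim`, the delay `tau`, the threshold `eps`, the Euclidean norm, and the choice of recurrence rate and determinism as example measures are all assumptions made here, and the abstract does not specify any of them.

```python
import numpy as np

def delay_embed(x, dim=3, tau=2):
    """Time-delay embedding: row i is (x[i], x[i+tau], ..., x[i+(dim-1)*tau]).
    dim and tau are illustrative defaults, not values from the paper."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (dim - 1) * tau
    if n <= 0:
        raise ValueError("frame too short for this dim/tau")
    return np.stack([x[k * tau : k * tau + n] for k in range(dim)], axis=1)

def recurrence_plot(points, eps):
    """Binary matrix with R[i, j] = 1 when points i and j lie within eps
    of each other (Euclidean distance; the norm is an assumption)."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return (dist <= eps).astype(np.uint8)

def recurrence_rate(rp):
    """Fraction of recurrent points in the plot (main diagonal included)."""
    return float(rp.mean())

def determinism(rp, min_len=2):
    """Share of off-diagonal recurrent points that lie on diagonal lines
    of length >= min_len. Only the upper triangle is scanned because the
    plot is symmetric."""
    n = rp.shape[0]
    total = 0
    on_lines = 0
    for k in range(1, n):
        diag = np.diagonal(rp, offset=k)
        total += int(diag.sum())
        run = 0
        for v in list(diag) + [0]:  # trailing 0 closes a run at the end
            if v:
                run += 1
            else:
                if run >= min_len:
                    on_lines += run
                run = 0
    return on_lines / total if total else 0.0

# Usage on a synthetic periodic "frame" (a stand-in for a speech frame).
t = np.arange(200)
frame = np.sin(2 * np.pi * t / 25)
pts = delay_embed(frame, dim=3, tau=6)
rp = recurrence_plot(pts, eps=0.2)
print(recurrence_rate(rp), determinism(rp))
```

Per-frame values such as these could then be summarized over a segment or utterance with statistical functionals (for example, mean and standard deviation across frames), which is the aggregation step the abstract mentions.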


Pages: 927-931

Number of pages: 5

