Integrating Recurrence Dynamics for Speech Emotion Recognition

Cited by: 11
Authors
Tzinis, Efthymios [1, 2]
Paraskevopoulos, Georgios [1, 2]
Baziotis, Christos [1]
Potamianos, Alexandros [1, 2]
Affiliations
[1] Natl Tech Univ Athens, Sch Elect & Comp Engn, Athens, Greece
[2] Behav Signal Technol, Los Angeles, CA USA
Funding
European Union Horizon 2020
Keywords
speech emotion recognition; recurrence quantification analysis; nonlinear dynamics; recurrence plots; features; framework
DOI
10.21437/Interspeech.2018-1377
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We investigate the performance of features that capture nonlinear recurrence dynamics embedded in the speech signal for the task of Speech Emotion Recognition (SER). Reconstructing the phase space of each speech frame and computing its Recurrence Plot (RP) reveals complex structures, which can be measured by performing Recurrence Quantification Analysis (RQA). These measures are aggregated using statistical functionals over segment and utterance periods. We report SER results for the proposed feature set on three databases using different classification methods. When fusing the proposed features with traditional feature sets, e.g., [1], we show an improvement in unweighted accuracy of up to 5.7% and 10.7% on Speaker-Dependent (SD) and Speaker-Independent (SI) SER tasks, respectively, over the baseline [1]. Following a segment-based approach, we demonstrate state-of-the-art performance on IEMOCAP using a Bidirectional Recurrent Neural Network.
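The pipeline named in the abstract (phase space reconstruction per frame, Recurrence Plot, RQA measures, statistical functionals) can be illustrated with a minimal sketch. This is not the authors' implementation: the time-delay embedding, the parameter names `dim`, `tau`, and `eps`, their default values, the choice of two RQA measures (recurrence rate and determinism), and the mean/standard-deviation functionals are all assumptions made for illustration, since this record does not state the settings used in the paper.

```python
import numpy as np

def delay_embed(frame, dim=3, tau=2):
    """Time-delay embedding: row i is (x[i], x[i+tau], ..., x[i+(dim-1)*tau])."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame) - (dim - 1) * tau
    if n <= 0:
        raise ValueError("frame too short for this dim/tau")
    return np.stack([frame[k * tau : k * tau + n] for k in range(dim)], axis=1)

def recurrence_plot(points, eps):
    """Binary matrix with R[i, j] = 1 when ||points[i] - points[j]|| <= eps."""
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    return (dists <= eps).astype(np.uint8)

def recurrence_rate(rp):
    """Fraction of recurrent points in the plot (main diagonal included)."""
    return float(np.mean(rp))

def determinism(rp, lmin=2):
    """Share of off-diagonal recurrent points on diagonal lines of length >= lmin."""
    n = rp.shape[0]
    on_lines, total = 0, 0
    for k in range(1, n):  # upper triangle only; the plot is symmetric
        diag = np.diagonal(rp, offset=k)
        total += int(diag.sum())
        run = 0
        for v in list(diag) + [0]:  # trailing 0 closes the final run
            if v:
                run += 1
            else:
                if run >= lmin:
                    on_lines += run
                run = 0
    return on_lines / total if total else 0.0

def frame_rqa(frame, dim=3, tau=2, eps=0.2):
    """Two illustrative RQA measures for one frame (assumed feature choice)."""
    rp = recurrence_plot(delay_embed(frame, dim, tau), eps)
    return np.array([recurrence_rate(rp), determinism(rp)])

def utterance_functionals(frames, **kw):
    """Aggregate frame-level measures with mean and standard deviation."""
    feats = np.stack([frame_rqa(f, **kw) for f in frames])
    return np.concatenate([feats.mean(axis=0), feats.std(axis=0)])

if __name__ == "__main__":
    # Synthetic stand-in for speech: a sinusoid cut into 100-sample frames.
    t = np.arange(400)
    signal = np.sin(2 * np.pi * 0.05 * t)
    frames = signal.reshape(4, 100)
    print(utterance_functionals(frames))
```

The pairwise distance matrix costs O(n²) memory per frame, which is acceptable for short frames but would need a different approach for long ones.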
Pages: 927-931
Number of pages: 5
Related papers
  • [1] Integrating Language and Emotion Features for Multilingual Speech Emotion Recognition
    Heracleous, Panikos
    Mohammad, Yasser
    Yoneyama, Akio
    [J]. HUMAN-COMPUTER INTERACTION. MULTIMODAL AND NATURAL INTERACTION, HCI 2020, PT II, 2020, 12182 : 187 - 196
  • [2] A multimodal emotion recognition model integrating speech, video and MoCAP
    Ning Jia
    Chunjun Zheng
    Wei Sun
    [J]. Multimedia Tools and Applications, 2022, 81 : 32265 - 32286
  • [3] A multimodal emotion recognition model integrating speech, video and MoCAP
    Jia, Ning
    Zheng, Chunjun
    Sun, Wei
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2022, 81 (22) : 32265 - 32286
  • [4] Continuous Emotion Recognition in Speech - Do We Need Recurrence?
    Schmitt, Maximilian
    Cummins, Nicholas
    Schuller, Bjoern
    [J]. INTERSPEECH 2019, 2019, : 2808 - 2812
  • [5] Speech emotion recognition using nonlinear dynamics features
    Shahzadi, Ali
    Ahmadyfard, Alireza
    Harimi, Ali
    Yaghmaie, Khashayar
    [J]. TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES, 2015, 23 : 2056 - 2073
  • [6] Two-level discriminative speech emotion recognition model with wave field dynamics: A personalized speech emotion recognition method
    Jia, Ning
    Zheng, Chunjun
    [J]. COMPUTER COMMUNICATIONS, 2021, 180 : 161 - 170
  • [7] Speech Emotion Recognition
    Lalitha, S.
    Madhavan, Abhishek
    Bhushan, Bharath
    Saketh, Srinivas
    [J]. 2014 INTERNATIONAL CONFERENCE ON ADVANCES IN ELECTRONICS, COMPUTERS AND COMMUNICATIONS (ICAECC), 2014,
  • [8] Recognition of Emotion Using Non-Linear Dynamics of Speech
    Harimi, Ali
    Shahzadi, Ali
    Ahmadyfard, Alireza
    [J]. 2014 7th International Symposium on Telecommunications (IST), 2014, : 446 - 451
  • [9] ANGER OR JOY? EMOTION RECOGNITION USING NONLINEAR DYNAMICS OF SPEECH
    Harimi, Ali
    Ahmadyfard, Alireza
    Shahzadi, Ali
    Yaghmaie, Khashayar
    [J]. APPLIED ARTIFICIAL INTELLIGENCE, 2015, 29 (07) : 675 - 696
  • [10] Emotion Recognition During Speech Using Dynamics of Multiple Regions of the Face
    Kim, Yelin
    Provost, Emily Mower
    [J]. ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2015, 12 (01)