Multiroom Speech Emotion Recognition

Authors
Shalev, Erez [1]
Cohen, Israel [1]
Affiliations
[1] Technion Israel Inst Technol, Andrew & Erna Viterbi Fac Elect & Comp Engn, IL-3200003 Haifa, Israel
Keywords
Emotion recognition; acoustics; room impulse response; multiroom; augmentation
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
Automated audio systems, such as speech emotion recognition, can benefit from the ability to operate from another room. No research has yet examined how effective such systems are when the sound source is in a different room from the target system and the sound must travel between the rooms through the wall. Recent advances in room-impulse-response generators enable large-scale simulation of audio sources in adjacent rooms and their integration into a training dataset. Such a capability improves the performance of data-driven methods such as deep learning. This paper presents the first evaluation of multiroom speech emotion recognition systems. Isolation policies during COVID-19 left many isolated individuals suffering emotional difficulties, cases in which such capabilities would be very beneficial. We train three different models, with and without an audio simulation generator, and compare their results on real data recorded in a real multiroom audio scene. Models trained without the new generator achieve poor results when presented with multiroom data, and augmentation using the new generator improves performance for all three models. Our results demonstrate the advantage of using such a generator. Furthermore, testing with two different deep learning architectures shows that the generator improves the results independently of the architecture.
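The augmentation idea in the abstract, passing clean speech through a simulated room response before training, can be illustrated with a minimal sketch. This is not the generator evaluated in the paper: the impulse response here is a toy (exponentially decaying noise), the moving-average filter is only a crude stand-in for transmission through a wall, and the names `synthetic_rir`, `wall_lowpass`, and `augment` are hypothetical.

```python
import numpy as np

def synthetic_rir(sample_rate=16000, rt60=0.5, length_s=0.6, seed=0):
    """Toy impulse response: white noise whose amplitude decays 60 dB over rt60 seconds."""
    rng = np.random.default_rng(seed)
    n = int(round(sample_rate * length_s))
    t = np.arange(n) / sample_rate
    decay = 10.0 ** (-3.0 * t / rt60)  # factor 1e-3 in amplitude equals -60 dB
    rir = rng.standard_normal(n) * decay
    return rir / np.max(np.abs(rir))

def wall_lowpass(signal, taps=64):
    """Assumed stand-in for wall attenuation: a moving-average low-pass filter."""
    kernel = np.ones(taps) / taps
    return np.convolve(signal, kernel, mode="full")

def augment(dry, rir, through_wall=True):
    """Convolve dry speech with the impulse response, optionally apply the
    wall filter, trim to the input length, and peak-normalize."""
    wet = np.convolve(dry, rir, mode="full")
    if through_wall:
        wet = wall_lowpass(wet)
    wet = wet[: len(dry)]
    peak = np.max(np.abs(wet))
    return wet / peak if peak > 0 else wet

# Usage: one second of placeholder "speech" (random noise) at 16 kHz.
dry = np.random.default_rng(1).standard_normal(16000)
wet = augment(dry, synthetic_rir())
```

In a training pipeline, each clean utterance would be paired with one or more such simulated versions, keeping the original emotion label.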
Pages: 135-139
Page count: 5
Source
Shalev, Erez; Cohen, Israel. Multiroom Speech Emotion Recognition [J]. European Signal Processing Conference, 2022, 2022-August: 135-139