The Influence of Blind Source Separation on Mixed Audio Speech and Music Emotion Recognition

Cited by: 3
Authors
Laugs, Casper [1]
Koops, Hendrik Vincent [2]
Odijk, Daan [2]
Kaya, Heysem [1]
Volk, Anja [1]
Affiliations
[1] Univ Utrecht, Utrecht, Netherlands
[2] RTL Netherlands, Hilversum, Netherlands
Keywords
Speech emotion recognition; music emotion recognition; blind source separation; multi-modal; FEATURES; MODELS
DOI
10.1145/3395035.3425252
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
While both speech emotion recognition and music emotion recognition have been studied extensively in different communities, little research went into the recognition of emotion from mixed audio sources, i.e. when both speech and music are present. However, many application scenarios require models that are able to extract emotions from mixed audio sources, such as television content. This paper studies how mixed audio affects both speech and music emotion recognition using a random forest and deep neural network model, and investigates if blind source separation of the mixed signal beforehand is beneficial. We created a mixed audio dataset, with 25% speech-music overlap without contextual relationship between the two. We show that specialized models for speech-only or music-only audio were able to achieve merely 'chance-level' performance on mixed audio. For speech, above chance-level performance was achieved when trained on raw mixed audio, but optimal performance was achieved with audio blind source separated beforehand. Music emotion recognition models on mixed audio achieve performance approaching or even surpassing performance on music-only audio, with and without blind source separation. Our results are important for estimating emotion from real-world data, where individual speech and music tracks are often not available.
Pages: 67-71
Page count: 5
Related papers
50 in total
  • [1] Speech Recognition Using Blind Source Separation and Dereverberation Method for Mixed Sound of Speech and Music
    Wang, Longbiao
    Odani, Kyohei
    Kai, Atsuhiko
    Li, Weifeng
    [J]. 2013 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2013,
  • [2] INFLUENCE OF AUDIO BANDWIDTH ON SPEECH EMOTION RECOGNITION BY HUMAN SUBJECTS
    Lahaie, Olivier
    Lefebvre, Roch
    Gournay, Philippe
    [J]. 2017 IEEE GLOBAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING (GLOBALSIP 2017), 2017, : 61 - 65
  • [3] Blind Source Separation of Noisy Mixed Speech Signals
    Li, Huiya
    Shi, Jianying
    Men, Jinxi
    [J]. SENSORS, MEASUREMENT AND INTELLIGENT MATERIALS II, PTS 1 AND 2, 2014, 475-476 : 291 - +
  • [4] Improved speech emotion recognition based on music-related audio features
    Vu, Linh
    Phan, Raphael C-W
    Han, Lim Wern
    Phung, Dinh
    [J]. 2022 30TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2022), 2022, : 120 - 124
  • [5] Blind source separation of speech and acoustic signal for moving audio sources
    Shibata, T
    Nagasaka, K
    [J]. 8TH WORLD MULTI-CONFERENCE ON SYSTEMICS, CYBERNETICS AND INFORMATICS, VOL VI, PROCEEDINGS: IMAGE, ACOUSTIC, SIGNAL PROCESSING AND OPTICAL SYSTEMS, TECHNOLOGIES AND APPLICATIONS, 2004, : 455 - 459
  • [6] Experiments in audio source separation with one sensor for robust speech recognition
    Benaroya, Laurent
    Bimbot, Frederic
    Gravier, Guillaume
    Gribonval, Remi
    [J]. SPEECH COMMUNICATION, 2006, 48 (07) : 848 - 854
  • [7] Perceptual evaluation of blind source separation for robust speech recognition
    Di Persia, Leandro
    Milone, Diego
    Rufiner, Hugo Leonardo
    Yanagida, Masuzo
    [J]. SIGNAL PROCESSING, 2008, 88 (10) : 2578 - 2583
  • [8] Speech Emotion Recognition Using Audio Matching
    Chaturvedi, Iti
    Noel, Tim
    Satapathy, Ranjan
    [J]. ELECTRONICS, 2022, 11 (23)
  • [9] Novel Audio Features for Music Emotion Recognition
    Panda, Renato
    Malheiro, Ricardo
    Paiva, Rui Pedro
    [J]. IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2020, 11 (04) : 614 - 626
  • [10] Audio Features for Music Emotion Recognition: A Survey
    Panda, Renato
    Malheiro, Ricardo
    Paiva, Rui Pedro
    [J]. IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2023, 14 (01) : 68 - 88