Semantic audio-visual data fusion for automatic emotion recognition

Authors
Datcu, Dragos [1]
Rothkrantz, Leon J. M. [1]
Affiliations
[1] Delft Univ Technol, Man Machine Interact Grp, NL-2628 CD Delft, Netherlands
Source
Keywords
data fusion; automatic emotion recognition; speech analysis; face detection; facial feature extraction; facial characteristic point extraction; Active Appearance Models; support vector machines
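DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract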
The paper describes a novel technique for recognizing emotions from multimodal data. We focus on the recognition of the six prototypic emotions. The results of facial expression recognition and of emotion recognition from speech are combined by a bimodal semantic data fusion model that determines the most probable emotion of the subject. Two types of models based on geometric face features are used for facial expression recognition, depending on the presence or absence of speech. Our approach defines an algorithm that is robust to the changes in face shape that occur during regular speech: the influence of phoneme generation on the face shape is removed by using only features related to the eyes and the eyebrows. The paper includes results from testing the presented models.
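As an illustration only, the decision logic outlined in the abstract can be sketched in Python. The abstract does not state the fusion rule, so the weighted sum below is an assumption, as are the emotion labels (the commonly used six basic emotions), the function name `fuse_emotions`, and the `speech_weight` parameter. The sketch assumes each recognizer already outputs one score per emotion.

```python
from typing import Dict, Optional

# Assumed label set: the commonly used six basic emotions.
EMOTIONS = ("anger", "disgust", "fear", "happiness", "sadness", "surprise")


def fuse_emotions(
    face_full: Dict[str, float],
    face_upper: Dict[str, float],
    speech: Optional[Dict[str, float]] = None,
    speech_weight: float = 0.5,
) -> str:
    """Return the most probable emotion label.

    face_full  : scores from a face model using all geometric features
                 (hypothetical model for silent segments).
    face_upper : scores from a face model restricted to eye and eyebrow
                 features (hypothetical model for speech segments).
    speech     : scores from the speech recognizer, or None if no speech.
    """
    if speech is None:
        # No speech: rely on the full-face model alone.
        return max(EMOTIONS, key=lambda e: face_full[e])

    # Speech present: combine the eye/eyebrow model with the speech scores.
    # The weighted sum is an assumed rule, not the paper's fusion model.
    fused = {
        e: (1.0 - speech_weight) * face_upper[e] + speech_weight * speech[e]
        for e in EMOTIONS
    }
    return max(fused, key=fused.get)
```

When `speech` is `None`, the function falls back to the full-face scores, mirroring the abstract's statement that the choice of face model depends on the presence or absence of speech.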
Pages: 58-65
Number of pages: 8
Related papers
50 in total
  • [21] AUDIO-VISUAL EMOTION RECOGNITION USING BOLTZMANN ZIPPERS
    Lu, Kun
    Jia, Yunde
    2012 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP 2012), 2012, : 2589 - 2592
  • [22] Audio-visual emotion recognition with multilayer boosted HMM
    Lü, Kun
    Jia, Yun-De
    Zhang, Xin
    Lü, K. (kunlv@bit.edu.cn), 1600, Beijing Institute of Technology (22): : 89 - 93
  • [23] Deep emotion recognition based on audio-visual correlation
    Hajarolasvadi, Noushin
    Demirel, Hasan
    IET COMPUTER VISION, 2020, 14 (07) : 517 - 527
  • [24] A Joint Cross-Attention Model for Audio-Visual Fusion in Dimensional Emotion Recognition
    Praveen, R. Gnana
    de Melo, Wheidima Carneiro
    Ullah, Nasib
    Aslam, Haseeb
    Zeeshan, Osama
    Denorme, Theo
    Pedersoli, Marco
    Koerich, Alessandro L.
    Bacon, Simon
    Cardinal, Patrick
    Granger, Eric
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2022, 2022, : 2485 - 2494
  • [25] Audio-visual emotion recognition with multilayer boosted HMM
    Lü, Kun
    Jia, Yun-De
    Zhang, Xin
    Journal of Beijing Institute of Technology, 2013, 22 (01) : 89 - 93
  • [26] Data Augmentation for Audio-Visual Emotion Recognition with an Efficient Multimodal Conditional GAN
    Ma, Fei
    Li, Yang
    Ni, Shiguang
    Huang, Shao-Lun
    Zhang, Lin
    APPLIED SCIENCES-BASEL, 2022, 12 (01):
  • [27] Audio-Visual Speech Emotion Recognition by Disentangling Emotion and Identity Attributes
    Ito, Koichiro
    Fujioka, Takuya
    Sun, Qinghua
    Nagamatsu, Kenji
    INTERSPEECH 2021, 2021, : 4493 - 4497
  • [28] Audio-visual feature fusion via deep neural networks for automatic speech recognition
    Rahmani, Mohammad Hasan
    Almasganj, Farshad
    Seyyedsalehi, Seyyed Ali
    DIGITAL SIGNAL PROCESSING, 2018, 82 : 54 - 63
  • [29] Audio-Visual Automatic Speech Recognition for Connected Digits
    Wang, Xiaoping
    Hao, Yufeng
    Fu, Degang
    Yuan, Chunwei
    2008 INTERNATIONAL SYMPOSIUM ON INTELLIGENT INFORMATION TECHNOLOGY APPLICATION, VOL III, PROCEEDINGS, 2008, : 328 - +
  • [30] Vulnerability of Automatic Identity Recognition to Audio-Visual Deepfakes
    Korshunov, Pavel
    Chen, Haolin
    Garner, Philip N.
    Marcel, Sebastien
    2023 IEEE INTERNATIONAL JOINT CONFERENCE ON BIOMETRICS, IJCB, 2023,