Semantic audio-visual data fusion for automatic emotion recognition

Cited by: 0
Authors
Datcu, Dragos [1]
Rothkrantz, Leon J. M. [1]
Affiliations
[1] Delft Univ Technol, Man Machine Interact Grp, NL-2628 CD Delft, Netherlands
Keywords
data fusion; automatic emotion recognition; speech analysis; face detection; facial feature extraction; facial characteristic point extraction; Active Appearance Models; support vector machines
DOI
not available
Chinese Library Classification (CLC)
TP [automation technology, computer technology]
Discipline classification code
0812
Abstract
The paper describes a novel technique for the recognition of emotions from multimodal data. We focus on the recognition of the six prototypic emotions. The results of facial expression recognition and of emotion recognition from speech are combined using a bimodal semantic data fusion model that determines the most probable emotion of the subject. Two types of models based on geometric face features are used for facial expression recognition, depending on the presence or absence of speech. Our approach defines an algorithm that is robust to the changes of face shape that occur during regular speech: the influence of phoneme generation on the face shape is removed by using features related only to the eyes and the eyebrows. The paper includes results from testing the presented models.
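The fusion step described in the abstract can be pictured as a decision-level combination of per-class scores from the two modalities. Below is a minimal sketch, assuming each modality already yields a probability distribution over the six prototypic emotions (e.g., from SVM classifiers with probability outputs); the weighted-sum rule, class order, weight value, and function names are illustrative assumptions, not the paper's exact semantic fusion model.

import numpy as np

# Illustrative class order; the six prototypic emotions named in the abstract.
EMOTIONS = ["anger", "disgust", "fear", "happiness", "sadness", "surprise"]

def fuse_bimodal(p_face, p_speech, w_face=0.5):
    """Weighted-sum fusion of face and speech emotion probability vectors.

    p_face, p_speech: length-6 probability distributions over EMOTIONS.
    w_face: assumed relative weight of the facial-expression channel.
    Returns the most probable emotion label and the fused distribution.
    """
    p_face = np.asarray(p_face, dtype=float)
    p_speech = np.asarray(p_speech, dtype=float)
    fused = w_face * p_face + (1.0 - w_face) * p_speech
    fused /= fused.sum()  # renormalize so the result is again a distribution
    return EMOTIONS[int(np.argmax(fused))], fused

# Example: the face model leans towards surprise, the speech model towards happiness.
label, scores = fuse_bimodal(
    p_face=[0.05, 0.05, 0.10, 0.20, 0.10, 0.50],
    p_speech=[0.05, 0.05, 0.05, 0.55, 0.10, 0.20],
)
print(label, np.round(scores, 3))

In the setting of the paper, the face-side probabilities would come from the speech-robust model that uses only eye and eyebrow features whenever the subject is speaking.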
Pages: 58 - 65
Page count: 8
Related papers
50 records in total
  • [41] Fusion of deep learning features with mixture of brain emotional learning for audio-visual emotion recognition
    Farhoudi, Zeinab
    Setayeshi, Saeed
    SPEECH COMMUNICATION, 2021, 127 : 92 - 103
  • [42] An Active Learning Paradigm for Online Audio-Visual Emotion Recognition
    Kansizoglou, Ioannis
    Bampis, Loukas
    Gasteratos, Antonios
    IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2022, 13 (02) : 756 - 768
  • [43] Audio-Visual Emotion Recognition Using Big Data Towards 5G
    Hossain, M. Shamim
    Muhammad, Ghulam
    Alhamid, Mohammed F.
    Song, Biao
    Al-Mutib, Khaled
    MOBILE NETWORKS & APPLICATIONS, 2016, 21 (05): 753 - 763
  • [45] MANDARIN AUDIO-VISUAL SPEECH RECOGNITION WITH EFFECTS TO THE NOISE AND EMOTION
    Pao, Tsang-Long
    Liao, Wen-Yuan
    Chen, Yu-Te
    Wu, Tsan-Nung
    INTERNATIONAL JOURNAL OF INNOVATIVE COMPUTING INFORMATION AND CONTROL, 2010, 6 (02): 711 - 723
  • [46] Multimodal and Temporal Perception of Audio-visual Cues for Emotion Recognition
    Ghaleb, Esam
    Popa, Mirela
    Asteriadis, Stylianos
    2019 8TH INTERNATIONAL CONFERENCE ON AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION (ACII), 2019,
  • [47] Multimodal Emotion Recognition using Physiological and Audio-Visual Features
    Matsuda, Yuki
    Fedotov, Dmitrii
    Takahashi, Yuta
    Arakawa, Yutaka
    Yasumoto, Keiichi
    Minker, Wolfgang
    PROCEEDINGS OF THE 2018 ACM INTERNATIONAL JOINT CONFERENCE ON PERVASIVE AND UBIQUITOUS COMPUTING AND PROCEEDINGS OF THE 2018 ACM INTERNATIONAL SYMPOSIUM ON WEARABLE COMPUTERS (UBICOMP/ISWC'18 ADJUNCT), 2018, : 946 - 951
  • [48] A PRE-TRAINED AUDIO-VISUAL TRANSFORMER FOR EMOTION RECOGNITION
    Minh Tran
    Soleymani, Mohammad
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 4698 - 4702
  • [49] DISENTANGLEMENT FOR AUDIO-VISUAL EMOTION RECOGNITION USING MULTITASK SETUP
    Peri, Raghuveer
    Parthasarathy, Srinivas
    Bradshaw, Charles
    Sundaram, Shiva
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 6344 - 6348
  • [50] ISLA: Temporal Segmentation and Labeling for Audio-Visual Emotion Recognition
    Kim, Yelin
    Provost, Emily Mower
    IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2019, 10 (02) : 196 - 208