Multimodal Emotion Recognition in Response to Videos

Cited by: 457
Authors
Soleymani, Mohammad [1 ]
Pantic, Maja [2 ,3 ]
Pun, Thierry [1 ]
Affiliations
[1] Univ Geneva, Dept Comp Sci, Comp Vis & Multimedia Lab, CH-1227 Carouge, GE, Switzerland
[2] Univ London Imperial Coll Sci Technol & Med, Dept Comp, London SW7 2AZ, England
[3] Univ Twente, Fac Elect Engn Math & Comp Sci, NL-7522 NB Enschede, Netherlands
Funding
European Research Council; Swiss National Science Foundation;
Keywords
Emotion recognition; EEG; pupillary reflex; pattern classification; affective computing; PUPIL LIGHT REFLEX; CLASSIFICATION; OSCILLATIONS; SYSTEMS;
DOI
10.1109/T-AFFC.2011.37
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents a user-independent emotion recognition method with the goal of recovering affective tags for videos using electroencephalogram (EEG), pupillary response and gaze distance. We first selected 20 video clips with extrinsic emotional content from movies and online resources. Then, EEG responses and eye gaze data were recorded from 24 participants while watching emotional video clips. Ground truth was defined based on the median arousal and valence scores given to clips in a preliminary study using an online questionnaire. Based on the participants' responses, three classes for each dimension were defined. The arousal classes were calm, medium aroused, and activated and the valence classes were unpleasant, neutral, and pleasant. One of the three affective labels of either valence or arousal was determined by classification of bodily responses. A one-participant-out cross validation was employed to investigate the classification performance in a user-independent approach. The best classification accuracies of 68.5 percent for three labels of valence and 76.4 percent for three labels of arousal were obtained using a modality fusion strategy and a support vector machine. The results over a population of 24 participants demonstrate that user-independent emotion recognition can outperform individual self-reports for arousal assessments and does not underperform for valence assessments.
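The one-participant-out cross validation described in the abstract can be sketched as follows. This is a minimal illustration of the fold-splitting protocol only, not the authors' implementation; the feature extraction and SVM classifier are out of scope, and all names here are illustrative.

```python
def leave_one_participant_out(participant_ids):
    """Yield (held_out, train_idx, test_idx) per fold, holding out
    all trials of one participant at a time (user-independent split)."""
    for held_out in sorted(set(participant_ids)):
        test_idx = [i for i, p in enumerate(participant_ids) if p == held_out]
        train_idx = [i for i, p in enumerate(participant_ids) if p != held_out]
        yield held_out, train_idx, test_idx

# Illustrative example: 3 participants, 2 trials each
ids = [1, 1, 2, 2, 3, 3]
folds = list(leave_one_participant_out(ids))
# One fold per participant; each fold's test set contains only
# trials from the held-out participant.
```

A classifier trained on each fold's training indices and scored on its test indices yields the user-independent accuracy reported in the paper.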
Pages: 211-223
Page count: 13
Related Papers (50 total)
  • [21] Zhao, Jinming; Chen, Shizhe; Wang, Shuai; Jin, Qin. Emotion Recognition using Multimodal Features. 2018 FIRST ASIAN CONFERENCE ON AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION (ACII ASIA), 2018.
  • [22] Yang, Pei; Liu, Niqi; Liu, Xinge; Shu, Yezhi; Ji, Wenqi; Ren, Ziqi; Sheng, Jenny; Yu, Minjing; Yi, Ran; Zhang, Dan; Liu, Yong-Jin. A Multimodal Dataset for Mixed Emotion Recognition. SCIENTIFIC DATA, 2024, 11 (01).
  • [23] Sebe, N; Cohen, I; Gevers, T; Huang, TS. Multimodal approaches for emotion recognition: A survey. INTERNET IMAGING VI, 2005, 5670: 56-67.
  • [24] Gajsek, Rok; Struc, Vitomir; Mihelic, France. Multimodal Emotion Recognition Based on the Decoupling of Emotion and Speaker Information. TEXT, SPEECH AND DIALOGUE, 2010, 6231: 275-282.
  • [25] Liu, Wei; Qiu, Jie-Lin; Zheng, Wei-Long; Lu, Bao-Liang. Comparing Recognition Performance and Robustness of Multimodal Deep Learning Models for Multimodal Emotion Recognition. IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS, 2022, 14 (02): 715-729.
  • [26] Xu, Tong; Zhou, Peilun; Hu, Linkang; He, Xiangnan; Hu, Yao; Chen, Enhong. Socializing the Videos: A Multimodal Approach for Social Relation Recognition. ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2021, 17 (01).
  • [27] Selvi, P. Tamil; Vyshnavi, P.; Jagadish, R.; Srikumar, Shravan; Veni, S. Emotion Recognition from Videos Using Facial Expressions. ARTIFICIAL INTELLIGENCE AND EVOLUTIONARY COMPUTATIONS IN ENGINEERING SYSTEMS, ICAIECES 2016, 2017, 517: 565-576.
  • [28] Shahroudy, Amir; Ng, Tian-Tsong; Yang, Qingxiong; Wang, Gang. Multimodal Multipart Learning for Action Recognition in Depth Videos. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2016, 38 (10): 2123-2129.
  • [29] Bargal, Sarah Adel; Barsoum, Emad; Ferrer, Cristian Canton; Zhang, Cha. Emotion Recognition in the Wild from Videos using Images. ICMI'16: PROCEEDINGS OF THE 18TH ACM INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION, 2016: 433-436.
  • [30] Gan, Yaozong; Togo, Ren; Ogawa, Takahiro; Haseyama, Miki. Transformer Based Multimodal Scene Recognition in Soccer Videos. 2022 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO WORKSHOPS (IEEE ICMEW 2022), 2022.