Multimodal Emotion Recognition in Response to Videos

Cited by: 457
Authors
Soleymani, Mohammad [1 ]
Pantic, Maja [2 ,3 ]
Pun, Thierry [1 ]
Affiliations
[1] Univ Geneva, Dept Comp Sci, Comp Vis & Multimedia Lab, CH-1227 Carouge, GE, Switzerland
[2] Univ London Imperial Coll Sci Technol & Med, Dept Comp, London SW7 2AZ, England
[3] Univ Twente, Fac Elect Engn Math & Comp Sci, NL-7522 NB Enschede, Netherlands
Funding
European Research Council; Swiss National Science Foundation;
Keywords
Emotion recognition; EEG; pupillary reflex; pattern classification; affective computing; PUPIL LIGHT REFLEX; CLASSIFICATION; OSCILLATIONS; SYSTEMS;
DOI
10.1109/T-AFFC.2011.37
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents a user-independent emotion recognition method with the goal of recovering affective tags for videos using electroencephalogram (EEG), pupillary response and gaze distance. We first selected 20 video clips with extrinsic emotional content from movies and online resources. Then, EEG responses and eye gaze data were recorded from 24 participants while watching emotional video clips. Ground truth was defined based on the median arousal and valence scores given to clips in a preliminary study using an online questionnaire. Based on the participants' responses, three classes for each dimension were defined. The arousal classes were calm, medium aroused, and activated and the valence classes were unpleasant, neutral, and pleasant. One of the three affective labels of either valence or arousal was determined by classification of bodily responses. A one-participant-out cross validation was employed to investigate the classification performance in a user-independent approach. The best classification accuracies of 68.5 percent for three labels of valence and 76.4 percent for three labels of arousal were obtained using a modality fusion strategy and a support vector machine. The results over a population of 24 participants demonstrate that user-independent emotion recognition can outperform individual self-reports for arousal assessments and does not underperform for valence assessments.
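The one-participant-out cross validation described in the abstract can be sketched as follows. This is a minimal illustration of the fold-splitting protocol only, not the authors' implementation; the feature extraction and SVM classifier are out of scope, and all names here are illustrative.

```python
def leave_one_participant_out(participant_ids):
    """Yield (held_out, train_idx, test_idx) per fold, holding out
    all trials of one participant at a time (user-independent split)."""
    for held_out in sorted(set(participant_ids)):
        test_idx = [i for i, p in enumerate(participant_ids) if p == held_out]
        train_idx = [i for i, p in enumerate(participant_ids) if p != held_out]
        yield held_out, train_idx, test_idx

# Illustrative example: 3 participants, 2 trials each
ids = [1, 1, 2, 2, 3, 3]
folds = list(leave_one_participant_out(ids))
# One fold per participant; each fold's test set contains only
# trials from the held-out participant.
```

A classifier trained on each fold's training indices and scored on its test indices yields the user-independent accuracy reported in the paper.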
Pages: 211-223
Page count: 13
Related Papers (50 total)
  • [21] Zhao, Jinming; Chen, Shizhe; Wang, Shuai; Jin, Qin. Emotion Recognition using Multimodal Features. 2018 FIRST ASIAN CONFERENCE ON AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION (ACII ASIA), 2018.
  • [22] Yang, Pei; Liu, Niqi; Liu, Xinge; Shu, Yezhi; Ji, Wenqi; Ren, Ziqi; Sheng, Jenny; Yu, Minjing; Yi, Ran; Zhang, Dan; Liu, Yong-Jin. A Multimodal Dataset for Mixed Emotion Recognition. SCIENTIFIC DATA, 2024, 11 (01).
  • [23] Sebe, N; Cohen, I; Gevers, T; Huang, TS. Multimodal approaches for emotion recognition: A survey. INTERNET IMAGING VI, 2005, 5670: 56-67.
  • [24] Gajsek, Rok; Struc, Vitomir; Mihelic, France. Multimodal Emotion Recognition Based on the Decoupling of Emotion and Speaker Information. TEXT, SPEECH AND DIALOGUE, 2010, 6231: 275-282.
  • [25] Liu, Wei; Qiu, Jie-Lin; Zheng, Wei-Long; Lu, Bao-Liang. Comparing Recognition Performance and Robustness of Multimodal Deep Learning Models for Multimodal Emotion Recognition. IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS, 2022, 14 (02): 715-729.
  • [26] Xu, Tong; Zhou, Peilun; Hu, Linkang; He, Xiangnan; Hu, Yao; Chen, Enhong. Socializing the Videos: A Multimodal Approach for Social Relation Recognition. ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2021, 17 (01).
  • [27] Selvi, P. Tamil; Vyshnavi, P.; Jagadish, R.; Srikumar, Shravan; Veni, S. Emotion Recognition from Videos Using Facial Expressions. ARTIFICIAL INTELLIGENCE AND EVOLUTIONARY COMPUTATIONS IN ENGINEERING SYSTEMS, ICAIECES 2016, 2017, 517: 565-576.
  • [28] Shahroudy, Amir; Ng, Tian-Tsong; Yang, Qingxiong; Wang, Gang. Multimodal Multipart Learning for Action Recognition in Depth Videos. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2016, 38 (10): 2123-2129.
  • [29] Bargal, Sarah Adel; Barsoum, Emad; Ferrer, Cristian Canton; Zhang, Cha. Emotion Recognition in the Wild from Videos using Images. ICMI'16: PROCEEDINGS OF THE 18TH ACM INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION, 2016: 433-436.
  • [30] Gan, Yaozong; Togo, Ren; Ogawa, Takahiro; Haseyama, Miki. Transformer Based Multimodal Scene Recognition in Soccer Videos. 2022 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO WORKSHOPS (IEEE ICMEW 2022), 2022.