Exploring Multimodal Visual Features for Continuous Affect Recognition

Cited by: 23
Authors
Sun, Bo [1 ]
Cao, Siming [1 ]
Li, Liandong [1 ]
He, Jun [1 ]
Yu, Lejun [1 ]
Affiliations
[1] Beijing Normal Univ, Coll Informat Sci & Technol, Beijing 100875, Peoples R China
Source
PROCEEDINGS OF THE 6TH INTERNATIONAL WORKSHOP ON AUDIO/VISUAL EMOTION CHALLENGE (AVEC'16), 2016
Keywords
Continuous Emotion Recognition; CNN; Multimodal Features; SVR; Residual Network
DOI
10.1145/2988257.2988270
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper presents our work in the Emotion Sub-Challenge of the 6th Audio/Visual Emotion Challenge and Workshop (AVEC 2016), whose goal is to exploit audio, visual and physiological signals to continuously predict the values of the emotion dimensions arousal and valence. Because visual features are very important for emotion recognition, we explore a variety of handcrafted and deep visual features. For each video clip, besides the baseline features, we extract multi-scale Dense SIFT features (MSDF) and several types of Convolutional Neural Network (CNN) features to recognize the expression phases of the current frame. We train a linear Support Vector Regression (SVR) model for each type of feature on the RECOLA dataset, and then fuse the modalities with a multiple linear regression model. The final Concordance Correlation Coefficients (CCC) we obtain are 0.824 for arousal and 0.718 for valence on the development set, and 0.683 for arousal and 0.642 for valence on the test set.
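As a rough illustration of the pipeline summarized in the abstract (a minimal sketch, not the authors' released code), the Python snippet below trains one linear SVR per feature type on placeholder arrays, fuses the unimodal predictions with a multiple linear regression model, and scores the fused output with the Concordance Correlation Coefficient used in AVEC 2016. The feature names, dimensionalities, and hyperparameters are hypothetical, and scikit-learn's LinearSVR and LinearRegression merely stand in for whatever SVR and fusion implementations the authors actually used.

# Sketch: per-modality linear SVR + multiple-linear-regression fusion, scored with CCC.
# All data below is synthetic; replace with real per-frame features and gold ratings.
import numpy as np
from sklearn.svm import LinearSVR
from sklearn.linear_model import LinearRegression

def ccc(y_true, y_pred):
    # Concordance Correlation Coefficient: 2*cov / (var_t + var_p + (mu_t - mu_p)^2)
    mu_t, mu_p = y_true.mean(), y_pred.mean()
    var_t, var_p = y_true.var(), y_pred.var()
    cov = np.mean((y_true - mu_t) * (y_pred - mu_p))
    return 2 * cov / (var_t + var_p + (mu_t - mu_p) ** 2)

rng = np.random.default_rng(0)
n_train, n_dev = 1000, 400
# Hypothetical per-frame features for each modality (e.g. MSDF, CNN, audio baseline).
modalities = {name: (rng.normal(size=(n_train, d)), rng.normal(size=(n_dev, d)))
              for name, d in [("msdf", 128), ("cnn", 256), ("audio", 88)]}
y_train = rng.uniform(-1, 1, n_train)   # gold arousal (or valence) labels, training split
y_dev = rng.uniform(-1, 1, n_dev)       # gold labels, development split

# Step 1: train one linear SVR per modality and collect its development predictions.
dev_preds = []
for name, (X_tr, X_dev) in modalities.items():
    svr = LinearSVR(C=0.1, max_iter=10000).fit(X_tr, y_train)
    dev_preds.append(svr.predict(X_dev))
dev_preds = np.column_stack(dev_preds)  # shape: (n_dev, n_modalities)

# Step 2: fuse the unimodal predictions with multiple linear regression.
fusion = LinearRegression().fit(dev_preds, y_dev)
fused = fusion.predict(dev_preds)

print("CCC of fused prediction:", ccc(y_dev, fused))

In practice the fusion regression would be fitted on a partition different from the one it is evaluated on; the sketch reuses the development split only to keep the example short.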
Pages: 83-88
Number of pages: 6
Related Papers
50 records in total (10 shown)
  • [1] CONTINUOUS VISUAL SPEECH RECOGNITION FOR MULTIMODAL FUSION
    Benhaim, Eric
    Sahbi, Hichem
    Vitte, Guillaume
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014
  • [2] Independent information from visual features for multimodal speech recognition
    Gurbuz, S
    Tufekci, Z
    Patterson, E
    Gowdy, JN
    IEEE SOUTHEASTCON 2001: ENGINEERING THE FUTURE, PROCEEDINGS, 2001: 221-228
  • [3] Multimodal Continuous Affect Recognition Based on LSTM and Multiple Kernel Learning
    Wei, Jiamei
    Pei, Ercheng
    Jiang, Dongmei
    Sahli, Hichem
    Xie, Lei
    Fu, Zhonghua
    2014 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2014
  • [4] Multimodal Emotion Recognition using Physiological and Audio-Visual Features
    Matsuda, Yuki
    Fedotov, Dmitrii
    Takahashi, Yuta
    Arakawa, Yutaka
    Yasumoto, Keiichi
    Minker, Wolfgang
    PROCEEDINGS OF THE 2018 ACM INTERNATIONAL JOINT CONFERENCE ON PERVASIVE AND UBIQUITOUS COMPUTING AND PROCEEDINGS OF THE 2018 ACM INTERNATIONAL SYMPOSIUM ON WEARABLE COMPUTERS (UBICOMP/ISWC'18 ADJUNCT), 2018: 946-951
  • [5] Improving Face Recognition by Exploring Local Features with Visual Attention
    Shi, Yichun
    Jain, Anil K.
    2018 INTERNATIONAL CONFERENCE ON BIOMETRICS (ICB), 2018: 247-254
  • [6] Visual affect recognition
    Stathopoulou, I.-O.
    Tsihrintzis, G.A.
    Frontiers in Artificial Intelligence and Applications, 2010, 214: 1-267
  • [7] Fusion Mappings for Multimodal Affect Recognition
    Kaechele, Markus
    Schels, Martin
    Thiam, Patrick
    Schwenker, Friedhelm
    2015 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (IEEE SSCI), 2015: 307-313
  • [8] Exploring Multimodal Features and Fusion for Time-Continuous Prediction of Emotional Valence and Arousal
    Kumar, Ajit
    Choi, Bong Jun
    Pandey, Sandeep Kumar
    Park, Sanghyeon
    Choi, SeongIk
    Shekhawat, Hanumant Singh
    De Neve, Wesley
    Saini, Mukesh
    Prasanna, S. R. M.
    Singh, Dhananjay
    INTELLIGENT HUMAN COMPUTER INTERACTION, IHCI 2021, 2022, 13184: 729-744
  • [9] Monocular 3D Facial Expression Features for Continuous Affect Recognition
    Pei, Ercheng
    Oveneke, Meshia Cedric
    Zhao, Yong
    Jiang, Dongmei
    Sahli, Hichem
    IEEE TRANSACTIONS ON MULTIMEDIA, 2021, 23: 3540-3550
  • [10] Multimodal continuous visual attention mechanisms
    Farinhas, António
    Martins, André F.T.
    Aguiar, Pedro M.Q.
    arXiv, 2021