Exploring Multimodal Visual Features for Continuous Affect Recognition

Cited by: 23
Authors
Sun, Bo [1 ]
Cao, Siming [1 ]
Li, Liandong [1 ]
He, Jun [1 ]
Yu, Lejun [1 ]
Affiliations
[1] Beijing Normal Univ, Coll Informat Sci & Technol, Beijing 100875, Peoples R China
Source
PROCEEDINGS OF THE 6TH INTERNATIONAL WORKSHOP ON AUDIO/VISUAL EMOTION CHALLENGE (AVEC'16) | 2016
Keywords
Continuous Emotion Recognition; CNN; Multimodal Features; SVR; Residual Network;
DOI
10.1145/2988257.2988270
CLC Number
O42 [Acoustics];
Discipline Codes
070206; 082403;
Abstract
This paper presents our work in the Emotion Sub-Challenge of the 6th Audio/Visual Emotion Challenge and Workshop (AVEC 2016), whose goal is to explore the use of audio, visual, and physiological signals to continuously predict the values of the emotion dimensions (arousal and valence). As visual features are very important in emotion recognition, we try a variety of handcrafted and deep visual features. For each video clip, besides the baseline features, we extract multi-scale Dense SIFT features (MSDF) and several types of convolutional neural network (CNN) features to recognize the expression phases of the current frame. We train a linear Support Vector Regression (SVR) model for each kind of feature on the RECOLA dataset. Multimodal fusion of these modalities is then performed with a multiple linear regression model. The final Concordance Correlation Coefficients (CCC) we obtained are 0.824 for arousal and 0.718 for valence on the development set, and 0.683 for arousal and 0.642 for valence on the test set.
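The CCC scores reported in the abstract follow Lin's concordance correlation coefficient, which penalizes both poor correlation and systematic shifts in mean or scale between predictions and gold annotations. A minimal sketch of that metric in plain Python (the `ccc` function name and the toy inputs are illustrative, not from the paper; the biased population variance/covariance convention is an assumption, though it is the common choice in AVEC evaluations):

```python
from statistics import mean

def ccc(x, y):
    """Concordance Correlation Coefficient (Lin, 1989):
    CCC = 2*cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))**2)
    Uses population (biased) variance and covariance."""
    mx, my = mean(x), mean(y)
    n = len(x)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    var_x = sum((a - mx) ** 2 for a in x) / n
    var_y = sum((b - my) ** 2 for b in y) / n
    return 2 * cov / (var_x + var_y + (mx - my) ** 2)

# Perfect agreement yields CCC = 1.0; a constant offset lowers it
# even when the Pearson correlation stays at 1.
print(ccc([0.1, 0.2, 0.3], [0.1, 0.2, 0.3]))  # -> 1.0
print(ccc([0.1, 0.2, 0.3], [0.6, 0.7, 0.8]) < 1.0)  # -> True
```

Unlike Pearson correlation, the mean-difference term in the denominator makes CCC drop when predictions are biased, which is why it is the standard metric for continuous arousal/valence prediction.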
Pages: 83-88
Page count: 6
Related Papers
50 records total
  • [21] Audio-visual continuous speech recognition using mpeg-4 compliant visual features
    Aleksic, PS
    Williams, JJ
    Wu, ZL
    Katsaggelos, AK
    2002 INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, VOL I, PROCEEDINGS, 2002, : 960 - 963
  • [22] Multimodal Features and Accurate Place Recognition With Robust Optimization for Lidar-Visual-Inertial SLAM
    Zhao, Xiongwei
    Wen, Congcong
    Manoj Prakhya, Sai
    Yin, Hongpei
    Zhou, Rundong
    Sun, Yijiao
    Xu, Jie
    Bai, Haojie
    Wang, Yang
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2024, 73 : 1 - 1
  • [23] Multimodal Affect Recognition in Virtual Worlds: Avatars Mirroring Users' Affect
    Gonzalez-Sanchez, Javier
    Chavez-Echeagaray, Maria Elena
    Gibson, David
    Atkinson, Robert
    2013 HUMAINE ASSOCIATION CONFERENCE ON AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION (ACII), 2013, : 724 - +
  • [24] Exploring Multimodal Video Representation for Action Recognition
    Wang, Cheng
    Yang, Haojin
    Meinel, Christoph
    2016 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2016, : 1924 - 1931
  • [25] ALPHABET OF FEATURES IN VISUAL RECOGNITION
    DAVYDOVA, KN
    INTERNATIONAL JOURNAL OF PSYCHOLOGY, 1992, 27 (3-4) : 39 - 40
  • [26] Exploring Fusion Strategies in Deep Multimodal Affect Prediction
    Patania, Sabrina
    D'Amelio, Alessandro
    Lanzarotti, Raffaella
    IMAGE ANALYSIS AND PROCESSING, ICIAP 2022, PT II, 2022, 13232 : 730 - 741
  • [27] Audio-visual affect recognition
    Zeng, Zhihong
    Tu, Jilin
    Liu, Ming
    Huang, Thomas S.
    Pianfetti, Brian
    Roth, Dan
    Levinson, Stephen
    IEEE TRANSACTIONS ON MULTIMEDIA, 2007, 9 (02) : 424 - 428
  • [28] MULTIMODAL TRANSFORMER FUSION FOR CONTINUOUS EMOTION RECOGNITION
    Huang, Jian
    Tao, Jianhua
    Liu, Bin
    Lian, Zheng
    Niu, Mingyue
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 3507 - 3511
  • [29] Multimodal Prompting with Missing Modalities for Visual Recognition
    Lee, Yi-Lun
    Tsai, Yi-Hsuan
    Chiu, Wei-Chen
    Lee, Chen-Yu
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 14943 - 14952
  • [30] MULTIMODAL AFFECT MODELING AND RECOGNITION FOR EMPATHIC ROBOT COMPANIONS
    Castellano, Ginevra
    Leite, Iolanda
    Pereira, Andre
    Martinho, Carlos
    Paiva, Ana
    McOwan, Peter W.
    INTERNATIONAL JOURNAL OF HUMANOID ROBOTICS, 2013, 10 (01)