Exploring Multimodal Visual Features for Continuous Affect Recognition

Cited by: 23
Authors
Sun, Bo [1 ]
Cao, Siming [1 ]
Li, Liandong [1 ]
He, Jun [1 ]
Yu, Lejun [1 ]
Affiliations
[1] Beijing Normal Univ, Coll Informat Sci & Technol, Beijing 100875, Peoples R China
Source
PROCEEDINGS OF THE 6TH INTERNATIONAL WORKSHOP ON AUDIO/VISUAL EMOTION CHALLENGE (AVEC'16) | 2016
Keywords
Continuous Emotion Recognition; CNN; Multimodal Features; SVR; Residual Network;
DOI
10.1145/2988257.2988270
CLC Number
O42 [Acoustics];
Discipline Codes
070206; 082403;
Abstract
This paper presents our work in the Emotion Sub-Challenge of the 6th Audio/Visual Emotion Challenge and Workshop (AVEC 2016), whose goal is to explore the use of audio, visual, and physiological signals to continuously predict the values of the emotion dimensions (arousal and valence). As visual features are very important in emotion recognition, we try a variety of handcrafted and deep visual features. For each video clip, besides the baseline features, we extract multi-scale Dense SIFT features (MSDF) and several types of convolutional neural network (CNN) features to recognize the expression phases of the current frame. We train a linear Support Vector Regression (SVR) model for each kind of feature on the RECOLA dataset. Multimodal fusion of these modalities is then performed with a multiple linear regression model. The final Concordance Correlation Coefficients (CCC) we obtained are 0.824 for arousal and 0.718 for valence on the development set, and 0.683 for arousal and 0.642 for valence on the test set.
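The CCC scores reported in the abstract follow Lin's concordance correlation coefficient, which penalizes both poor correlation and systematic shifts in mean or scale between predictions and gold annotations. A minimal sketch of that metric in plain Python (the `ccc` function name and the toy inputs are illustrative, not from the paper; the biased population variance/covariance convention is an assumption, though it is the common choice in AVEC evaluations):

```python
from statistics import mean

def ccc(x, y):
    """Concordance Correlation Coefficient (Lin, 1989):
    CCC = 2*cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))**2)
    Uses population (biased) variance and covariance."""
    mx, my = mean(x), mean(y)
    n = len(x)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    var_x = sum((a - mx) ** 2 for a in x) / n
    var_y = sum((b - my) ** 2 for b in y) / n
    return 2 * cov / (var_x + var_y + (mx - my) ** 2)

# Perfect agreement yields CCC = 1.0; a constant offset lowers it
# even when the Pearson correlation stays at 1.
print(ccc([0.1, 0.2, 0.3], [0.1, 0.2, 0.3]))  # -> 1.0
print(ccc([0.1, 0.2, 0.3], [0.6, 0.7, 0.8]) < 1.0)  # -> True
```

Unlike Pearson correlation, the mean-difference term in the denominator makes CCC drop when predictions are biased, which is why it is the standard metric for continuous arousal/valence prediction.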
Pages: 83-88
Page count: 6
Related Papers
50 records total
  • [21] Audio-visual continuous speech recognition using mpeg-4 compliant visual features
    Aleksic, PS
    Williams, JJ
    Wu, ZL
    Katsaggelos, AK
    2002 INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, VOL I, PROCEEDINGS, 2002, : 960 - 963
  • [22] Multimodal Features and Accurate Place Recognition With Robust Optimization for Lidar-Visual-Inertial SLAM
    Zhao, Xiongwei
    Wen, Congcong
    Manoj Prakhya, Sai
    Yin, Hongpei
    Zhou, Rundong
    Sun, Yijiao
    Xu, Jie
    Bai, Haojie
    Wang, Yang
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2024, 73 : 1 - 1
  • [23] Multimodal Affect Recognition in Virtual Worlds: Avatars Mirroring Users' Affect
    Gonzalez-Sanchez, Javier
    Chavez-Echeagaray, Maria Elena
    Gibson, David
    Atkinson, Robert
    2013 HUMAINE ASSOCIATION CONFERENCE ON AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION (ACII), 2013, : 724 - +
  • [24] Exploring Multimodal Video Representation for Action Recognition
    Wang, Cheng
    Yang, Haojin
    Meinel, Christoph
    2016 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2016, : 1924 - 1931
  • [25] ALPHABET OF FEATURES IN VISUAL RECOGNITION
    DAVYDOVA, KN
    INTERNATIONAL JOURNAL OF PSYCHOLOGY, 1992, 27 (3-4) : 39 - 40
  • [26] Exploring Fusion Strategies in Deep Multimodal Affect Prediction
    Patania, Sabrina
    D'Amelio, Alessandro
    Lanzarotti, Raffaella
    IMAGE ANALYSIS AND PROCESSING, ICIAP 2022, PT II, 2022, 13232 : 730 - 741
  • [27] Audio-visual affect recognition
    Zeng, Zhihong
    Tu, Jilin
    Liu, Ming
    Huang, Thomas S.
    Pianfetti, Brian
    Roth, Dan
    Levinson, Stephen
    IEEE TRANSACTIONS ON MULTIMEDIA, 2007, 9 (02) : 424 - 428
  • [28] MULTIMODAL TRANSFORMER FUSION FOR CONTINUOUS EMOTION RECOGNITION
    Huang, Jian
    Tao, Jianhua
    Liu, Bin
    Lian, Zheng
    Niu, Mingyue
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 3507 - 3511
  • [29] Multimodal Prompting with Missing Modalities for Visual Recognition
    Lee, Yi-Lun
    Tsai, Yi-Hsuan
    Chiu, Wei-Chen
    Lee, Chen-Yu
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 14943 - 14952
  • [30] MULTIMODAL AFFECT MODELING AND RECOGNITION FOR EMPATHIC ROBOT COMPANIONS
    Castellano, Ginevra
    Leite, Iolanda
    Pereira, Andre
    Martinho, Carlos
    Paiva, Ana
    McOwan, Peter W.
    INTERNATIONAL JOURNAL OF HUMANOID ROBOTICS, 2013, 10 (01)