High-Level Geometry-based Features of Video Modality for Emotion Prediction

Cited by: 13
Authors
Weber, Raphael [1]
Barrielle, Vincent [2]
Soladie, Catherine [1]
Seguier, Renaud [1]
Affiliations
[1] IETR, FAST, CentraleSupelec, Ave Boulaie, F-35576 Cesson Sevigne, France
[2] Dynamixyz, 80 Ave Buttes de Coesmes, F-35700 Rennes, France
Keywords
HEAD POSE ESTIMATION; FACIAL EXPRESSIONS; REPRESENTATION
DOI
10.1145/2988257.2988262
Chinese Library Classification number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
The automatic analysis of emotion remains a challenging task in unconstrained experimental conditions. In this paper, we present our contribution to the 6th Audio/Visual Emotion Challenge (AVEC 2016), which aims at predicting the continuous emotional dimensions of arousal and valence. First, we propose to improve the performance of multimodal prediction with low-level features by adding high-level geometry-based features, namely head pose and expression signature. The head pose is estimated by fitting a reference 3D mesh to the 2D facial landmarks. The expression signature is the projection of the facial landmarks in an unsupervised person-specific model. Second, we propose to fuse the unimodal predictions trained on each training subject before performing the multimodal fusion. The results show that our high-level features improve the performance of multimodal prediction of arousal, and that subject fusion works well in unimodal prediction but generalizes poorly in multimodal prediction, particularly on valence.
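To make the head pose step more concrete, the following is a minimal sketch of fitting a reference 3D mesh to 2D landmarks. It is not the authors' implementation: the abstract does not specify the camera model or the solver, so this sketch assumes a scaled-orthographic camera and a closed-form least-squares fit. The function name fit_head_pose and its parameters are hypothetical.

```python
import numpy as np


def fit_head_pose(mesh_3d, landmarks_2d):
    """Fit scale, rotation and 2D translation so that
    landmarks_2d ~= scale * mesh_3d @ R[:2].T + t.

    Assumption (not from the paper): scaled-orthographic projection.
    mesh_3d:      (N, 3) reference mesh vertices matching the landmarks.
    landmarks_2d: (N, 2) detected facial landmarks.
    """
    mesh_3d = np.asarray(mesh_3d, dtype=float)
    landmarks_2d = np.asarray(landmarks_2d, dtype=float)

    # Remove the translation by centering both point sets.
    mesh_mean = mesh_3d.mean(axis=0)
    lm_mean = landmarks_2d.mean(axis=0)
    X = mesh_3d - mesh_mean
    x = landmarks_2d - lm_mean

    # Least-squares affine camera A (2x3) such that X @ A.T ~= x.
    A_T, *_ = np.linalg.lstsq(X, x, rcond=None)
    A = A_T.T

    # Project A onto "scale times two orthonormal rows" via SVD.
    U, S, Vt = np.linalg.svd(A, full_matrices=False)
    scale = S.mean()
    R2 = U @ Vt                      # first two rows of the rotation

    # The third row completes a right-handed rotation matrix.
    r3 = np.cross(R2[0], R2[1])
    R = np.vstack([R2, r3])

    # Recover the 2D translation from the centroids.
    t = lm_mean - scale * (R2 @ mesh_mean)
    return scale, R, t
```

Head pose angles (yaw, pitch, roll) can then be read from R under whichever Euler convention is preferred. A perspective camera model or an iterative solver would be an equally plausible reading of the abstract.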
Pages: 51-58
Number of pages: 8