High-Level Geometry-based Features of Video Modality for Emotion Prediction

Cited by: 13
Authors
Weber, Raphael [1]
Barrielle, Vincent [2]
Soladie, Catherine [1]
Seguier, Renaud [1]
Affiliations
[1] IETR, FAST, CentraleSupelec, Ave Boulaie, F-35576 Cesson Sevigne, France
[2] Dynamixyz, 80 Ave Buttes de Coesmes, F-35700 Rennes, France
Keywords
HEAD POSE ESTIMATION; FACIAL EXPRESSIONS; REPRESENTATION;
DOI
10.1145/2988257.2988262
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
The automatic analysis of emotion remains a challenging task in unconstrained experimental conditions. In this paper, we present our contribution to the 6th Audio/Visual Emotion Challenge (AVEC 2016), which aims at predicting the continuous emotional dimensions of arousal and valence. First, we propose to improve the performance of the multi-modal prediction with low-level features by adding high-level geometry-based features, namely head pose and expression signature. The head pose is estimated by fitting a reference 3D mesh to the 2D facial landmarks. The expression signature is the projection of the facial landmarks in an unsupervised person-specific model. Second, we propose to fuse the unimodal predictions trained on each training subject before performing the multimodal fusion. The results show that our high-level features improve the performance of the multi-modal prediction of arousal and that the subjects fusion works well in unimodal prediction but generalizes poorly in multimodal prediction, particularly on valence.
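The head-pose step described in the abstract (fitting a reference 3D mesh to 2D facial landmarks) can be illustrated with a minimal sketch. The abstract does not specify the camera model or the solver, so everything below is an assumption: a weak-perspective (scaled orthographic) camera, a closed-form least-squares fit, and the hypothetical function name `estimate_head_pose`. It is not the authors' implementation.

```python
import numpy as np


def estimate_head_pose(mesh_3d, landmarks_2d):
    """Fit a rigid 3D mesh to 2D landmarks (hypothetical helper).

    Assumes a weak-perspective camera: x = s * R[:2] @ X + t.
    Returns the 3x3 rotation matrix R and the scale s.
    """
    mesh_3d = np.asarray(mesh_3d, dtype=float)            # shape (N, 3)
    landmarks_2d = np.asarray(landmarks_2d, dtype=float)  # shape (N, 2)

    # Centering both point sets removes the unknown 2D translation t.
    X = mesh_3d - mesh_3d.mean(axis=0)
    x = landmarks_2d - landmarks_2d.mean(axis=0)

    # Least-squares solve X @ A.T ~= x for the 2x3 affine camera matrix A.
    A_t, *_ = np.linalg.lstsq(X, x, rcond=None)
    A = A_t.T

    # Project A onto the closest matrix with orthonormal rows (polar factor).
    U, S, Vt = np.linalg.svd(A, full_matrices=False)
    R2 = U @ Vt

    # The third row completes a right-handed rotation matrix.
    R = np.vstack([R2, np.cross(R2[0], R2[1])])
    return R, S.mean()


if __name__ == "__main__":
    # Synthetic check: rotate a random "mesh" about the vertical axis and project it.
    rng = np.random.default_rng(0)
    mesh = rng.normal(size=(20, 3))
    angle = 0.3
    R_true = np.array([[np.cos(angle), 0.0, np.sin(angle)],
                       [0.0, 1.0, 0.0],
                       [-np.sin(angle), 0.0, np.cos(angle)]])
    points = 2.5 * mesh @ R_true[:2].T + np.array([10.0, -4.0])
    R_est, scale = estimate_head_pose(mesh, points)
    print(np.round(R_est, 3), round(float(scale), 3))
```

The closed-form fit needs non-coplanar mesh points and ignores perspective distortion; a full perspective solver would be a different design choice, and the abstract does not say which one the paper uses.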
Pages: 51-58
Number of pages: 8