High-Level Geometry-based Features of Video Modality for Emotion Prediction

Cited by: 13

Authors:

Weber, Raphael [1]

Barrielle, Vincent [2]

Soladie, Catherine [1]

Seguier, Renaud [1]

Affiliations:

[1] IETR, FAST, CentraleSupelec, Ave Boulaie, F-35576 Cesson Sevigne, France

[2] Dynamixyz, 80 Ave Buttes de Coesmes, F-35700 Rennes, France

Source:

PROCEEDINGS OF THE 6TH INTERNATIONAL WORKSHOP ON AUDIO/VISUAL EMOTION CHALLENGE (AVEC'16) | 2016

Keywords:

HEAD POSE ESTIMATION; FACIAL EXPRESSIONS; REPRESENTATION;

DOI:

10.1145/2988257.2988262

Chinese Library Classification:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

The automatic analysis of emotion remains a challenging task in unconstrained experimental conditions. In this paper, we present our contribution to the 6th Audio/Visual Emotion Challenge (AVEC 2016), which aims at predicting the continuous emotional dimensions of arousal and valence. First, we propose to improve the performance of the multi-modal prediction with low-level features by adding high-level geometry-based features, namely head pose and expression signature. The head pose is estimated by fitting a reference 3D mesh to the 2D facial landmarks. The expression signature is the projection of the facial landmarks in an unsupervised person-specific model. Second, we propose to fuse the unimodal predictions trained on each training subject before performing the multimodal fusion. The results show that our high-level features improve the performance of the multi-modal prediction of arousal and that the subjects fusion works well in unimodal prediction but generalizes poorly in multimodal prediction, particularly on valence.


Pages: 51-58

Page count: 8
