Deep Learning Based Video Spatio-Temporal Modeling for Emotion Recognition

Cited by: 5
Authors
Fonnegra, Ruben D. [1]
Diaz, Gloria M. [1]
Affiliations
[1] Inst Tecnol Metropolitano, Medellin, Colombia
Keywords
Deep learning; Facial emotion recognition; Spatio-temporal modeling; FACIAL EXPRESSION RECOGNITION; DESIGN; SYSTEM
DOI
10.1007/978-3-319-91238-7_32
Chinese Library Classification (CLC) number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
Affective computing is a growing research area that aims to determine users' emotional states from their conscious and unconscious actions and to use those states to adapt the machine's interaction. This paper investigates the discriminative ability of convolutional and recurrent neural networks to model spatio-temporal features from video sequences of the face region. In the proposed deep learning architecture, dense convolutional layers analyze changes in spatial information across frames over short time periods, while dense recurrent layers model the frames as temporal sequences that change over time. These layers are then connected to a multilayer perceptron (MLP) that performs the classification task, which consists of distinguishing between six emotion categories. Performance was evaluated in two settings: gender-independent and gender-dependent classification. Experimental results show that the proposed approach achieves an accuracy of 81.84% in the gender-independent experiment, which outperforms previous works using the same experimental data. In the gender-dependent experiment, accuracy was 80.79% for male subjects and 82.75% for female subjects.
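
The data flow described in the abstract (per-frame convolutional features, a recurrent pass over the frame sequence, then an MLP over six classes) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the layer sizes, the 3x3 kernels, the plain Elman recurrence, the global average pooling, the random untrained weights, and every function name below are assumptions chosen only to keep the example small and dependency-free.

```python
import math
import random

EMOTIONS = 6  # the abstract states six emotion categories (it does not name them)

def conv2d_valid(frame, kernel):
    """Valid 2D cross-correlation of one grayscale frame with one kernel, then ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(frame) - kh + 1):
        row = []
        for j in range(len(frame[0]) - kw + 1):
            s = sum(frame[i + a][j + b] * kernel[a][b]
                    for a in range(kh) for b in range(kw))
            row.append(max(0.0, s))
        out.append(row)
    return out

def spatial_features(frame, kernels):
    """One feature per kernel: global average of the rectified response map."""
    feats = []
    for k in kernels:
        fmap = conv2d_valid(frame, k)
        total = sum(sum(r) for r in fmap)
        feats.append(total / (len(fmap) * len(fmap[0])))
    return feats

def matvec(m, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def rnn_last_state(seq, w_in, w_rec):
    """Plain Elman recurrence over per-frame features; returns the final hidden state."""
    h = [0.0] * len(w_rec)
    for x in seq:
        a = matvec(w_in, x)
        b = matvec(w_rec, h)
        h = [math.tanh(p + q) for p, q in zip(a, b)]
    return h

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def init_params(rng, n_kernels=4, hidden=8, mlp_hidden=8):
    """Random, untrained weights; all sizes are illustrative assumptions."""
    return {
        "kernels": [rand_matrix(3, 3, rng) for _ in range(n_kernels)],
        "w_in": rand_matrix(hidden, n_kernels, rng),
        "w_rec": rand_matrix(hidden, hidden, rng),
        "w_mlp1": rand_matrix(mlp_hidden, hidden, rng),
        "w_mlp2": rand_matrix(EMOTIONS, mlp_hidden, rng),
    }

def classify_clip(clip, params):
    """Spatial stage per frame, temporal stage over frames, then a two-layer MLP."""
    seq = [spatial_features(f, params["kernels"]) for f in clip]
    h = rnn_last_state(seq, params["w_in"], params["w_rec"])
    hidden = [max(0.0, v) for v in matvec(params["w_mlp1"], h)]
    return softmax(matvec(params["w_mlp2"], hidden))

rng = random.Random(0)
params = init_params(rng)
# A synthetic "clip": 5 frames of 8x8 random pixel intensities in [0, 1).
clip = [[[rng.random() for _ in range(8)] for _ in range(8)] for _ in range(5)]
probs = classify_clip(clip, params)
```

Here `probs` is a probability vector over the six classes. Because the weights are untrained, the values carry no meaning; the sketch only shows how a frame sequence passes through a spatial stage, a temporal stage, and a classifier.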
Pages: 397-408
Number of pages: 12