DUAL FOCUS ATTENTION NETWORK FOR VIDEO EMOTION RECOGNITION

Cited by: 0
Authors
Qiu, Haonan [1]
He, Liang [1]
Wang, Feng [1]
Affiliations
[1] East China Normal Univ, Sch Comp Sci & Technol, Shanghai Key Lab Multidimens Informat Proc, Shanghai, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Video emotion recognition; attention for video; deep learning
DOI
10.1109/icme46284.2020.9102808
Chinese Library Classification (CLC) number
TP31 [Computer software]
Subject classification codes
081202; 0835
Abstract
Video emotion recognition is a challenging task because scenes are complex and emotions are expressed in many forms. Most existing work focuses on fusing multiple features over the whole video clip. According to our observations, in a long video clip the emotion is usually conveyed by only a few actions or objects in a few short snippets, and the meaningful cues are buried in a noisy background. When humans judge the emotion in a video, they first find the informative clips and then look closely for emotional cues in the frames. In this paper, we propose the Dual Focus Attention Network to mimic this process. First, three kinds of features (action, object, and scene) are extracted from the video. Second, two attention modules focus on the visual features along the temporal and spatial dimensions, respectively. With the dual focus attention network, we can effectively discover the most emotional frames along the time dimension and the most emotional visual cues in each frame. Experiments on two widely used datasets, Ekman and VideoEmotion, show that the proposed approach outperforms existing approaches.
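The two-stage attention idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the scoring functions, the order of the two modules, or the feature dimensions, so everything below is an assumption made for illustration. The function name `dual_focus_pool`, the linear scoring vectors `w_spatial` and `w_temporal`, and the toy input are hypothetical. The sketch weights regions within each frame (spatial focus) and then weights the resulting frame vectors across time (temporal focus), using softmax attention in both steps.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def weighted_sum(weights, vectors):
    """Return sum_i weights[i] * vectors[i] as a single vector."""
    dim = len(vectors[0])
    return [sum(w * vec[i] for w, vec in zip(weights, vectors)) for i in range(dim)]

def dual_focus_pool(video, w_spatial, w_temporal):
    """Hypothetical dual attention pooling (assumed design, not from the paper).

    video: list of frames; each frame is a list of region feature vectors.
    w_spatial, w_temporal: assumed linear scoring vectors.
    """
    frame_feats, spatial_weights = [], []
    for regions in video:
        # Spatial focus: attention weights over the regions of one frame.
        weights = softmax([dot(r, w_spatial) for r in regions])
        spatial_weights.append(weights)
        frame_feats.append(weighted_sum(weights, regions))
    # Temporal focus: attention weights over the pooled frame vectors.
    temporal_weights = softmax([dot(f, w_temporal) for f in frame_feats])
    video_feat = weighted_sum(temporal_weights, frame_feats)
    return video_feat, spatial_weights, temporal_weights

# Toy input: 3 frames, 2 regions per frame, 2-dimensional features.
video = [
    [[0.1, 0.2], [0.9, 0.1]],
    [[0.5, 0.5], [0.0, 1.0]],
    [[0.3, 0.7], [0.8, 0.2]],
]
video_feat, sw, tw = dual_focus_pool(video, [1.0, -1.0], [0.5, 0.5])
print(len(video_feat), len(sw), len(tw))
```

In a trained model the scoring vectors would be learned, and the pooled video vector would feed a classifier over emotion categories; here they are fixed by hand only to show how the weights are formed and applied.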
Pages: 6