Multimodal Local-Global Attention Network for Affective Video Content Analysis

Cited by: 37
Authors
Ou, Yangjun [1 ]
Chen, Zhenzhong [1 ]
Wu, Feng [2 ]
Affiliations
[1] Wuhan Univ, Sch Remote Sensing & Informat Engn, Wuhan 430079, Peoples R China
[2] Univ Sci & Technol China, Sch Informat Sci & Technol, Hefei 230027, Peoples R China
Keywords
Visualization; Task analysis; Psychology; Feature extraction; Hidden Markov models; Analytical models; Brain modeling; Affective content analysis; multimodal learning; attention; EMOTION RECOGNITION; MODEL; REPRESENTATION; INTEGRATION; DATABASE;
DOI
10.1109/TCSVT.2020.3014889
CLC Classification Number
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Subject Classification Code
0808 ; 0809 ;
Abstract
With the rapid development of video distribution and broadcasting, affective video content analysis has recently attracted considerable research and development activity. Predicting the emotional responses of movie audiences is a challenging task in affective computing, since the induced emotions are relatively subjective. In this article, we propose a multimodal local-global attention network (MMLGAN) for affective video content analysis. Inspired by the multimodal integration effect, we extend the attention mechanism to multi-level fusion and design a multimodal fusion unit to obtain a global representation of an affective video. The multimodal fusion unit selects key parts from the multimodal local streams in the local attention stage and captures the information distribution across time in the global attention stage. Experiments on the LIRIS-ACCEDE dataset, the MediaEval 2015 and 2016 datasets, the FilmStim dataset, the DEAP dataset and the VideoEmotion dataset demonstrate the effectiveness of our approach compared with state-of-the-art methods.
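The two-stage fusion the abstract describes can be sketched roughly as follows. This is an illustrative reconstruction, not the authors' implementation: the function and parameter names (`local_global_fusion`, `w_local`, `w_global`), the dot-product attention scoring, and the simple additive combination of modalities are all assumptions made for the sketch.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def local_global_fusion(streams, w_local, w_global):
    """Sketch of a local-global attention fusion (assumed form).

    streams:  list of (T, d) arrays, one per modality (e.g. visual, audio),
              assumed here to share the same segment count T and feature dim d.
    w_local:  list of (d,) scoring vectors, one per modality -- hypothetical
              learned parameters for the local attention stage.
    w_global: (d,) scoring vector for the global attention stage.
    Returns a single (d,) global representation of the video.
    """
    reweighted = []
    for X, w in zip(streams, w_local):
        a = softmax(X @ w)              # local attention: score each segment
        reweighted.append(a[:, None] * X)  # emphasize key parts of the stream
    Z = np.sum(reweighted, axis=0)      # combine modalities per time step (assumed additive)
    g = softmax(Z @ w_global)           # global attention: weight across time
    return (g[:, None] * Z).sum(axis=0)
```

Each local attention produces a distribution over a modality's segments, so salient moments dominate that stream; the global stage then redistributes weight over the fused timeline before pooling to one vector.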
Pages: 1901-1914
Page count: 14
Related Papers
50 records total
  • [1] MLG-NCS: Multimodal Local-Global Neuromorphic Computing System for Affective Video Content Analysis
    Ji, Xiaoyue
    Dong, Zhekang
    Zhou, Guangdong
    Lai, Chun Sing
    Qi, Donglian
    IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2024, 54 (08): : 5137 - 5149
  • [2] Affective Video Content Analysis via Multimodal Deep Quality Embedding Network
    Zhu, Yaochen
    Chen, Zhenzhong
    Wu, Feng
    IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2022, 13 (03) : 1401 - 1415
  • [3] Local-Global Fusion Network for Video Super-Resolution
    Su, Dewei
    Wang, Hua
    Jin, Longcun
    Sun, Xianfang
    Peng, Xinyi
    IEEE ACCESS, 2020, 8 : 172443 - 172456
  • [4] Local-Global Dynamic Filtering Network for Video Super-Resolution
    Zhang, Chaopeng
    Wang, Xingtao
    Xiong, Ruiqin
    Fan, Xiaopeng
    Zhao, Debin
    IEEE TRANSACTIONS ON COMPUTATIONAL IMAGING, 2023, 9 : 963 - 976
  • [5] Weakly Supervised Local-Global Attention Network for Facial Expression Recognition
    Zhang, Haifeng
    Su, Wen
    Wang, Zengfu
    IEEE ACCESS, 2020, 8 (08): : 37976 - 37987
  • [6] A Multimodal Deep Regression Bayesian Network for Affective Video Content Analyses
    Gan, Quan
    Wang, Shangfei
    Hao, Longfei
    Ji, Qiang
    2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, : 5123 - 5132
  • [7] Representation Learning through Multimodal Attention and Time-Sync Comments for Affective Video Content Analysis
    Pan, Jicai
    Wang, Shangfei
    Fang, Lin
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022,
  • [8] Multimodal Deep Denoise Framework for Affective Video Content Analysis
    Zhu, Yaochen
    Chen, Zhenzhong
    Wu, Feng
    PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA (MM'19), 2019, : 130 - 138
  • [9] Co-attention Guided Local-Global Feature Fusion for Aspect-Level Multimodal Sentiment Analysis
    Cai, Guoyong
    Wang, Shunjie
    Lv, Guangrui
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT I, 2024, 14425 : 370 - 382
  • [10] Fully Convolutional Transformer with Local-Global Attention
    Lee, Sihaeng
    Yi, Eojindl
    Lee, Janghyeon
    Yoo, Jinsu
    Lee, Honglak
    Kim, Seung Hwan
    2022 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2022, : 552 - 559