Multimodal Local-Global Attention Network for Affective Video Content Analysis

Cited by: 37
Authors
Ou, Yangjun [1]
Chen, Zhenzhong [1]
Wu, Feng [2]
Affiliations
[1] Wuhan Univ, Sch Remote Sensing & Informat Engn, Wuhan 430079, Peoples R China
[2] Univ Sci & Technol China, Sch Informat Sci & Technol, Hefei 230027, Peoples R China
Keywords
Visualization; Task analysis; Psychology; Feature extraction; Hidden Markov models; Analytical models; Brain modeling; Affective content analysis; multimodal learning; attention; EMOTION RECOGNITION; MODEL; REPRESENTATION; INTEGRATION; DATABASE;
DOI
10.1109/TCSVT.2020.3014889
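Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code
0808; 0809;
Abstract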
With the rapid development of video distribution and broadcasting, affective video content analysis has attracted a lot of research and development activities recently. Predicting emotional responses of movie audiences is a challenging task in affective computing, since the induced emotions can be considered relatively subjective. In this article, we propose a multimodal local-global attention network (MMLGAN) for affective video content analysis. Inspired by the multimodal integration effect, we extend the attention mechanism to multi-level fusion and design a multimodal fusion unit to obtain a global representation of affective video. The multimodal fusion unit selects key parts from multimodal local streams in the local attention stage and captures the information distribution across time in the global attention stage. Experiments on the LIRIS-ACCEDE dataset, the MediaEval 2015 and 2016 datasets, the FilmStim dataset, the DEAP dataset and the VideoEmotion dataset demonstrate the effectiveness of our approach when compared with the state-of-the-art methods.
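The two attention stages named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' MMLGAN architecture: the abstract gives no layer details, so the function name `local_global_fusion`, the linear scoring vectors `w_local` and `w_global`, and the choice to attend over modalities in the local stage and over time steps in the global stage are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def local_global_fusion(streams, w_local, w_global):
    """Hypothetical two-stage attention fusion (illustrative only).

    streams  : array of shape (M, T, D), M modality streams,
               T time steps, D features per step.
    w_local  : array of shape (D,), assumed scoring vector for the local stage.
    w_global : array of shape (D,), assumed scoring vector for the global stage.
    """
    # Local stage (assumption): at each time step, weight the modalities.
    local_scores = streams @ w_local                  # (M, T)
    local_weights = softmax(local_scores, axis=0)     # sums to 1 over modalities
    fused = (local_weights[..., None] * streams).sum(axis=0)  # (T, D)

    # Global stage (assumption): weight the fused time steps.
    global_scores = fused @ w_global                  # (T,)
    global_weights = softmax(global_scores, axis=0)   # sums to 1 over time
    video_repr = global_weights @ fused               # (D,)
    return video_repr, local_weights, global_weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    streams = rng.normal(size=(2, 5, 4))   # e.g. audio and visual streams
    rep, lw, gw = local_global_fusion(
        streams, rng.normal(size=4), rng.normal(size=4)
    )
    print(rep.shape, lw.shape, gw.shape)
```

With all-zero scoring vectors both softmaxes become uniform, so the output reduces to the plain average over modalities and time; learned scoring vectors would shift weight toward the more informative streams and time steps.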
Pages: 1901-1914
Page count: 14
Related papers
50 in total
  • [31] Local-Global Interactive Network For Face Age Transformation
    Song, Jie
    Wei, Ping
    Li, Huan
    Zhang, Yongchi
    Zheng, Nanning
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 9258 - 9264
  • [32] SDPN: A Slight Dual-Path Network With Local-Global Attention Guided for Medical Image Segmentation
    Wang, Jing
    Li, Shuyi
    Yu, Luyue
    Qu, Aixi
    Wang, Qing
    Liu, Ju
    Wu, Qiang
    IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, 2023, 27 (06) : 2956 - 2967
  • [33] Global Local Fusion Neural Network for Multimodal Sentiment Analysis
    Hu, Xiaoran
    Yamamura, Masayuki
    APPLIED SCIENCES-BASEL, 2022, 12 (17):
  • [34] Local-global analysis of cooling tower with cutouts
    Hara, T
    Gould, PL
    COMPUTERS & STRUCTURES, 2002, 80 (27-30) : 2157 - 2166
  • [35] LGFF-Net: Airport Video Object Segmentation based on Local-Global Feature Fusion Network
    Wu, Honggang
    Li, Wenjing
    Wu, Min
    Zhang, Xiang
    PROCEEDINGS OF 2020 IEEE 2ND INTERNATIONAL CONFERENCE ON CIVIL AVIATION SAFETY AND INFORMATION TECHNOLOGY (ICCASIT), 2020, : 746 - 752
  • [36] Looming fear stimuli broadens attention in a local-global letter task
    Bellaera, Lauren
    von Muhlenen, Adrian
    EMOTION AND COGNITION, 2019, 247 : 47 - 87
  • [37] Auditory attention to frequency and time: an analogy to visual local-global stimuli
    Justus, T
    List, A
    COGNITION, 2005, 98 (01) : 31 - 51
  • [38] A Local-Global Attention Fusion Framework With Tensor Decomposition for Medical Diagnosis
    Peishu Wu
    Han Li
    Liwei Hu
    Jirong Ge
    Nianyin Zeng
    IEEE/CAA Journal of Automatica Sinica, 2024, 11 (06) : 1536 - 1538
  • [39] Content Based Video Retrival System for Mexican Culture Heritage based on Object Matching and Local-Global Descriptors
    Cedillo-Hernandez, Manuel
    Javier Garcia-Ugalde, Francisco
    Cedillo-Hernandez, Antonio
    Nakano-Miyatake, Mariko
    Perez-Meana, Hector
    2014 INTERNATIONAL CONFERENCE ON MECHATRONICS, ELECTRONICS AND AUTOMOTIVE ENGINEERING (ICMEAE), 2014, : 38 - 43
  • [40] Encouraging the Perceptual Underdog: Positive Affective Priming of Nonpreferred Local-Global Processes
    Tan, Hannah K.
    Jones, Gregory V.
    Watson, Derrick G.
    EMOTION, 2009, 9 (02) : 238 - 247