Multimodal Local-Global Attention Network for Affective Video Content Analysis

Cited by: 37
Authors
Ou, Yangjun [1 ]
Chen, Zhenzhong [1 ]
Wu, Feng [2 ]
Affiliations
[1] Wuhan Univ, Sch Remote Sensing & Informat Engn, Wuhan 430079, Peoples R China
[2] Univ Sci & Technol China, Sch Informat Sci & Technol, Hefei 230027, Peoples R China
Keywords
Visualization; Task analysis; Psychology; Feature extraction; Hidden Markov models; Analytical models; Brain modeling; Affective content analysis; multimodal learning; attention; EMOTION RECOGNITION; MODEL; REPRESENTATION; INTEGRATION; DATABASE
DOI
10.1109/TCSVT.2020.3014889
CLC Classification
TM [Electrical Technology]; TN [Electronic Technology, Communication Technology]
Discipline Codes
0808; 0809
Abstract
With the rapid development of video distribution and broadcasting, affective video content analysis has recently attracted considerable research and development activity. Predicting the emotional responses of movie audiences is a challenging task in affective computing, since induced emotions are relatively subjective. In this article, we propose a multimodal local-global attention network (MMLGAN) for affective video content analysis. Inspired by the multimodal integration effect, we extend the attention mechanism to multi-level fusion and design a multimodal fusion unit that obtains a global representation of an affective video. The fusion unit selects key parts from the multimodal local streams in its local attention stage and captures the distribution of information across time in its global attention stage. Experiments on the LIRIS-ACCEDE dataset, the MediaEval 2015 and 2016 datasets, the FilmStim dataset, the DEAP dataset and the VideoEmotion dataset demonstrate the effectiveness of our approach compared with state-of-the-art methods.
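The abstract describes a two-stage fusion: a local attention stage that picks key parts across the per-modality streams, followed by a global attention stage that weighs the fused representation across time. The record contains no code, so the following is only a minimal PyTorch sketch of that local-then-global pattern under stated assumptions; the class and parameter names (LocalGlobalFusion, local_score, global_score) are illustrative, all modality streams are assumed to share the same time length and feature dimension (in practice each modality would first be projected to a common space), and none of this reproduces the authors' actual MMLGAN architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LocalGlobalFusion(nn.Module):
    """Illustrative two-stage attention fusion (NOT the authors' MMLGAN).

    Local stage: at each time step, softmax attention over the modality
    streams selects their key parts. Global stage: softmax attention over
    time yields a single global representation of the video.
    """

    def __init__(self, dim: int):
        super().__init__()
        self.local_score = nn.Linear(dim, 1)   # scores modalities per time step (assumed form)
        self.global_score = nn.Linear(dim, 1)  # scores time steps of the fused stream

    def forward(self, streams: list[torch.Tensor]) -> torch.Tensor:
        # streams: one tensor per modality, each of shape (batch, time, dim).
        x = torch.stack(streams, dim=2)                    # (B, T, M, D)
        w = F.softmax(self.local_score(x), dim=2)          # local: attend over the M modalities
        fused_t = (w * x).sum(dim=2)                       # (B, T, D) fused per time step
        g = F.softmax(self.global_score(fused_t), dim=1)   # global: attend over the T time steps
        return (g * fused_t).sum(dim=1)                    # (B, D) global video representation


# Toy usage: visual and audio streams, 8 time steps each, 128-d features.
visual, audio = torch.randn(4, 8, 128), torch.randn(4, 8, 128)
fused = LocalGlobalFusion(128)([visual, audio])
print(fused.shape)  # torch.Size([4, 128])
```

Both stages here are single-layer additive scorers for brevity; the point of the sketch is only the ordering the abstract names, i.e. cross-modality selection first, then temporal weighting over the fused stream.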
Pages: 1901-1914
Page count: 14