Divided Caption Model with Global Attention

Cited: 0
Authors
Chen, Yamin [1 ]
Dua, Hancong [1 ]
Zhao, Zitian [1 ]
Wang, Zhi [1 ]
Affiliations
[1] University of Electronic Science and Technology of China, Chengdu, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
Video Caption; Global Attention; Bidirectional LSTM
DOI
10.1145/3461353.3461386
Chinese Library Classification
TP18 [Theory of Artificial Intelligence]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Dense video captioning is an emerging task that aims to both locate and describe all events in a video. We identify and tackle two challenges in this task: 1) the limitation of attending only to local features; and 2) the severely degraded descriptions and increased training complexity caused by redundant information. In this paper, we propose a new divided caption model in which two different attention mechanisms are introduced to refine the captioning process within a unified framework. First, we employ a global attention mechanism to encode video features in the proposal module, which yields better temporal boundaries. Second, we design a bidirectional long short-term memory (LSTM) network with a common-attention mechanism that effectively balances 3D convolutional neural network (C3D) features and globally attended video content in the caption module to generate coherent natural-language descriptions. In addition, we divide the forward and backward video features of an event into segments to alleviate the degraded descriptions and increased complexity. Extensive experiments demonstrate the competitive performance of the proposed Divided Caption Model with Global Attention (DCM-GA) against state-of-the-art methods on the ActivityNet Captions dataset.
Pages: 69-75 (7 pages)
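The abstract outlines two components that a short sketch can make concrete: a global attention module that summarizes all clip-level features into a context vector for the proposal stage, and a bidirectional LSTM encoder over the divided forward/backward features of an event. The following is a minimal PyTorch sketch of those ideas only; the module names, dimensions, scoring function, and the way the parts would be wired together are assumptions for illustration, not the authors' released implementation.

```python
# Minimal sketch of a global attention module and a divided BiLSTM encoder,
# loosely following the abstract. All names, shapes, and the additive-attention
# scoring function are assumptions, not taken from the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GlobalAttention(nn.Module):
    """Attend over all clip-level (e.g. C3D) features to build a global
    context vector that can complement a local feature in the proposal module."""

    def __init__(self, feat_dim: int, hidden_dim: int):
        super().__init__()
        self.query = nn.Linear(feat_dim, hidden_dim)
        self.key = nn.Linear(feat_dim, hidden_dim)
        self.score = nn.Linear(hidden_dim, 1)

    def forward(self, local_feat, all_feats):
        # local_feat: (batch, feat_dim); all_feats: (batch, T, feat_dim)
        q = self.query(local_feat).unsqueeze(1)              # (batch, 1, hidden)
        k = self.key(all_feats)                              # (batch, T, hidden)
        scores = self.score(torch.tanh(q + k)).squeeze(-1)   # (batch, T)
        weights = F.softmax(scores, dim=-1)
        # Weighted sum over all clips gives the global context vector.
        context = torch.bmm(weights.unsqueeze(1), all_feats).squeeze(1)
        return context, weights


class DividedBiLSTMEncoder(nn.Module):
    """Encode the clip features inside one event proposal with a bidirectional
    LSTM, so forward and backward passes cover the divided segments."""

    def __init__(self, feat_dim: int, hidden_dim: int):
        super().__init__()
        self.bilstm = nn.LSTM(feat_dim, hidden_dim,
                              batch_first=True, bidirectional=True)

    def forward(self, event_feats):
        # event_feats: (batch, T, feat_dim) clip features within one proposal
        outputs, _ = self.bilstm(event_feats)                # (batch, T, 2*hidden)
        return outputs


if __name__ == "__main__":
    batch, T, feat_dim, hidden = 2, 16, 500, 256
    feats = torch.randn(batch, T, feat_dim)
    attn = GlobalAttention(feat_dim, hidden)
    ctx, w = attn(feats[:, 0], feats)        # global context for the first clip
    enc = DividedBiLSTMEncoder(feat_dim, hidden)
    h = enc(feats)
    print(ctx.shape, w.shape, h.shape)       # (2, 500) (2, 16) (2, 16, 512)
```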