Refocused Attention: Long Short-Term Rewards Guided Video Captioning

Cited by: 1
Authors
Dong, Jiarong [1, 2]
Gao, Ke [1]
Chen, Xiaokai [1, 2]
Cao, Juan [1]
Affiliations
[1] Chinese Acad Sci, Inst Comp Technol, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Beijing, Peoples R China
Keywords
Video captioning; Hierarchical attention; Reinforcement learning; Reward
DOI
10.1007/s11063-019-10030-y
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The adaptive cooperation of the visual model and the language model is essential for video captioning. However, because end-to-end training lacks proper guidance at each time step, over-dependence on the language model often renders the attention-based visual model ineffective, which this paper calls the 'Attention Defocus' problem. Based on the observation that the recognition precision of entity words reflects the effectiveness of the visual model, we propose a novel strategy called refocused attention, which optimizes the training and cooperation of the visual model and the language model by providing guidance at the appropriate time steps. The strategy consists of a short-term-reward guided local entity recognition and a long-term-reward guided global relation understanding, neither of which requires any external training data. Moreover, a framework with hierarchical visual representations and hierarchical attention is established to fully exploit the potential of the proposed learning strategy. Extensive experiments demonstrate that the guidance strategy, together with the optimized structure, outperforms state-of-the-art video captioning methods, with relative improvements of 7.7% in BLEU-4 and 5.0% in CIDEr-D on the MSVD dataset, even without multi-modal features.
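As a rough illustration of how a per-step (short-term) reward and a sentence-level (long-term) reward might be combined into one training signal, consider the sketch below. The abstract does not give the paper's reward definitions, so everything here is an assumption made for illustration: the function names, the exact-match entity reward, the unigram F1 score standing in for a caption metric, and the mixing weight.

```python
from collections import Counter
from typing import List, Set


def short_term_rewards(sampled: List[str], reference_entities: Set[str]) -> List[float]:
    """Hypothetical per-step reward: 1.0 if the sampled word is a reference entity word."""
    return [1.0 if word in reference_entities else 0.0 for word in sampled]


def long_term_reward(sampled: List[str], reference: List[str]) -> float:
    """Hypothetical sentence-level reward: unigram F1 between caption and reference.

    This is only a stand-in for a real caption metric such as CIDEr-D.
    """
    if not sampled or not reference:
        return 0.0
    # Clipped unigram overlap (multiset intersection).
    overlap = sum((Counter(sampled) & Counter(reference)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(sampled)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)


def combined_rewards(
    sampled: List[str],
    reference: List[str],
    reference_entities: Set[str],
    weight: float = 0.5,  # assumed mixing weight, not taken from the paper
) -> List[float]:
    """Mix the per-step entity reward with the shared sentence-level reward."""
    sentence_score = long_term_reward(sampled, reference)
    step_scores = short_term_rewards(sampled, reference_entities)
    return [weight * s + (1.0 - weight) * sentence_score for s in step_scores]


sampled_caption = ["a", "man", "rides", "a", "horse"]
reference_caption = ["a", "man", "is", "riding", "a", "horse"]
entities = {"man", "horse"}
rewards = combined_rewards(sampled_caption, reference_caption, entities)
```

In a policy-gradient setup, each element of `rewards` would weight the log-probability of the word sampled at that time step, so entity words receive an immediate signal while every step still shares the sentence-level score.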
Pages: 935-948
Page count: 14