Video Summarization With Attention-Based Encoder-Decoder Networks

Cited by: 180
Authors
Ji, Zhong [1]
Xiong, Kailin [1]
Pang, Yanwei [1]
Li, Xuelong [2]
Affiliations
[1] Tianjin Univ, Sch Elect & Informat Engn, Tianjin 300072, Peoples R China
[2] Northwestern Polytech Univ, Ctr OPT IMagery Anal & Learning, Sch Comp Sci, Xian 710072, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Decoding; Visualization; Recurrent neural networks; Additives; Indexes; Internet; Semantics; Video summarization; LSTM; encoder-decoder; attention mechanism; DISCOVERY; FRAMEWORK; QUERY
DOI
10.1109/TCSVT.2019.2904996
Chinese Library Classification (CLC)
TM (Electrical Engineering); TN (Electronics and Communication Technology)
Discipline codes
0808; 0809
Abstract
This paper addresses the problem of supervised video summarization by formulating it as a sequence-to-sequence learning problem, where the input is a sequence of original video frames and the output is a keyshot sequence. Our key idea is to learn a deep summarization network with an attention mechanism that mimics the way humans select keyshots. To this end, we propose a novel video summarization framework named attentive encoder-decoder networks for video summarization (AVS), in which the encoder uses a bidirectional long short-term memory (BiLSTM) network to encode the contextual information among the input video frames. For the decoder, two attention-based LSTM networks are explored, using additive and multiplicative objective functions, respectively. Extensive experiments are conducted on two video summarization benchmark datasets, SumMe and TVSum. The results demonstrate the superiority of the proposed AVS-based approaches over the state-of-the-art approaches, with remarkable improvements on both datasets.
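The abstract describes the architecture only at a high level; the snippet below is a minimal PyTorch sketch of such an attentive encoder-decoder frame scorer, assuming pre-extracted frame features, a BiLSTM encoder, and an LSTM decoder whose attention weights come from either an additive or a multiplicative score function. The feature and hidden dimensions, the sigmoid importance head, and all layer names are illustrative assumptions, not the authors' published configuration.

```python
# Minimal sketch of an attentive encoder-decoder for frame-importance scoring.
# Assumptions (not from the paper): 1024-d frame features, 256-d hidden size,
# a per-step sigmoid regression head, and zero-initialized decoder state.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AttentiveSummarizer(nn.Module):
    def __init__(self, feat_dim=1024, hidden_dim=256, attention="additive"):
        super().__init__()
        # BiLSTM encoder over the input frame-feature sequence.
        self.encoder = nn.LSTM(feat_dim, hidden_dim, batch_first=True,
                               bidirectional=True)
        # LSTM decoder consumes the attention context at each step.
        self.decoder = nn.LSTMCell(2 * hidden_dim, 2 * hidden_dim)
        self.attention = attention
        if attention == "additive":
            # Additive score: v^T tanh(W_e h_enc + W_d h_dec)
            self.W_e = nn.Linear(2 * hidden_dim, hidden_dim, bias=False)
            self.W_d = nn.Linear(2 * hidden_dim, hidden_dim, bias=False)
            self.v = nn.Linear(hidden_dim, 1, bias=False)
        else:
            # Multiplicative score: h_dec^T W h_enc
            self.W = nn.Linear(2 * hidden_dim, 2 * hidden_dim, bias=False)
        self.out = nn.Linear(2 * hidden_dim, 1)  # per-frame importance score

    def score(self, enc_outs, h_dec):
        if self.attention == "additive":
            return self.v(torch.tanh(self.W_e(enc_outs)
                                     + self.W_d(h_dec).unsqueeze(1))).squeeze(-1)
        return torch.bmm(enc_outs, self.W(h_dec).unsqueeze(-1)).squeeze(-1)

    def forward(self, frames):                       # frames: (B, T, feat_dim)
        enc_outs, _ = self.encoder(frames)           # (B, T, 2*hidden_dim)
        B, T, H = enc_outs.shape
        h, c = enc_outs.new_zeros(B, H), enc_outs.new_zeros(B, H)
        scores = []
        for _ in range(T):
            # Attention weights over all encoder states, then a context vector.
            alpha = F.softmax(self.score(enc_outs, h), dim=1)        # (B, T)
            context = torch.bmm(alpha.unsqueeze(1), enc_outs).squeeze(1)
            h, c = self.decoder(context, (h, c))
            scores.append(self.out(h))
        return torch.sigmoid(torch.cat(scores, dim=1))               # (B, T)


# Usage: score a batch of two 120-frame feature sequences.
model = AttentiveSummarizer(attention="multiplicative")
importance = model(torch.randn(2, 120, 1024))        # (2, 120) scores in [0, 1]
```

Keyshots would then be formed by thresholding or knapsack-selecting shots according to these frame scores, which is the usual post-processing step in keyshot-based summarization rather than anything specific to this sketch.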
Pages: 1709-1717
Page count: 9
Related papers
50 records in total
  • [1] Dense Video Captioning with Hierarchical Attention-Based Encoder-Decoder Networks
    Yu, Mingjing
    Zheng, Huicheng
    Liu, Zehua
    [J]. 2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,
  • [2] Attention-based encoder-decoder networks for workflow recognition
    Zhang, Min
    Hu, Haiyang
    Li, Zhongjin
    Chen, Jie
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2021, 80 (28-29) : 34973 - 34995
  • [3] Describing Multimedia Content Using Attention-Based Encoder-Decoder Networks
    Cho, Kyunghyun
    Courville, Aaron
    Bengio, Yoshua
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2015, 17 (11) : 1875 - 1886
  • [4] Multiple attention-based encoder-decoder networks for gas meter character recognition
    Li, Weidong
    Wang, Shuai
    Ullah, Inam
    Zhang, Xuehai
    Duan, Jinlong
    [J]. SCIENTIFIC REPORTS, 2022, 12 (01)
  • [5] Pooling Attention-based Encoder-Decoder Network for semantic segmentation
    Xu, Haixia
    Huang, Yunjia
    Hancock, Edwin R.
    Wang, Shuailong
    Xuan, Qijun
    Zhou, Wei
    [J]. COMPUTERS & ELECTRICAL ENGINEERING, 2021, 93
  • [6] A Dual Attention Encoder-Decoder Text Summarization Model
    Hakami, Nada Ali
    Mahmoud, Hanan Ahmed Hosni
    [J]. CMC-COMPUTERS MATERIALS & CONTINUA, 2023, 74 (02): : 3697 - 3710
  • [7] Attention-Based Encoder-Decoder Network for Single Image Dehazing
    Gao, Shunan
    Zhu, Jinghua
    Xi, Heran
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA & EXPO WORKSHOPS (ICMEW), 2021,
  • [8] Enhanced Attention-Based Encoder-Decoder Framework for Text Recognition
    Prabu, S.
    Sundar, K. Joseph Abraham
    [J]. INTELLIGENT AUTOMATION AND SOFT COMPUTING, 2023, 35 (02): : 2071 - 2086
  • [9] Understanding attention-based encoder-decoder networks: a case study with chess scoresheet recognition
    Hayashi, Sergio Y.
    Hirata, Nina S. T.
    [J]. 2022 26TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2022, : 1586 - 1592