Efficient Transformer for Video Summarization

Cited by: 0
Authors
Kolmakova, Tatiana [1]
Makarov, Ilya [2, 3]
Affiliations
[1] HSE Univ, Moscow, Russia
[2] Artificial Intelligence Res Inst AIRI, Moscow, Russia
[3] NUST MISiS, AI Ctr, Moscow, Russia
Keywords
Video Summarization; Deep Learning; Transformers; CREATION;
DOI
10.1007/978-3-031-43078-7_5
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The amount of user-generated content is increasing daily. This is especially true for video content, which became popular through social media platforms such as TikTok. Other internet sources are keeping up and making video sharing easier. Automatic tools that extract the core information of content while reducing its volume are therefore essential, and video summarization aims to provide this. In this work, we propose a transformer-based approach to supervised video summarization. Previous applications of attention architectures either used lighter versions or loaded the models with RNN modules, which slow down computation. Our proposed framework uses the full advantages of transformers. Extensive evaluation on two benchmark datasets showed that the introduced model outperforms existing approaches on the SumMe dataset by 3% and shows comparable results on the TVSum dataset.
Pages: 52-65
Number of pages: 14