Video Joint Modelling Based on Hierarchical Transformer for Co-Summarization

Cited by: 14
Authors
Li, Haopeng [1 ]
Ke, Qiuhong [2 ]
Gong, Mingming [3 ]
Zhang, Rui [4 ]
Affiliations
[1] Univ Melbourne, Sch Comp & Informat Syst, Parkville, Vic 3010, Australia
[2] Monash Univ, Dept Data Sci & AI, Parkville, Vic 3010, Australia
[3] Univ Melbourne, Sch Math & Stat, Parkville, Vic 3010, Australia
[4] Tsinghua Univ, Beijing 100190, Peoples R China
Keywords
Transformers; Semantics; Correlation; Computational modeling; Training; Task analysis; Video on demand; Video summarization; co-summarization; hierarchical transformer; representation reconstruction;
DOI
10.1109/TPAMI.2022.3186506
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Video summarization aims to automatically generate a summary (storyboard or video skim) of a video, which can facilitate large-scale video retrieval and browsing. Most of the existing methods perform video summarization on individual videos, which neglects the correlations among similar videos. Such correlations, however, are also informative for video understanding and video summarization. To address this limitation, we propose Video Joint Modelling based on Hierarchical Transformer (VJMHT) for co-summarization, which takes into consideration the semantic dependencies across videos. Specifically, VJMHT consists of two layers of Transformer: the first layer extracts semantic representation from individual shots of similar videos, while the second layer performs shot-level video joint modelling to aggregate cross-video semantic information. By this means, complete cross-video high-level patterns are explicitly modelled and learned for the summarization of individual videos. Moreover, Transformer-based video representation reconstruction is introduced to maximize the high-level similarity between the summary and the original video. Extensive experiments are conducted to verify the effectiveness of the proposed modules and the superiority of VJMHT in terms of F-measure and rank-based evaluation.
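The two-level structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each shot is a list of frame feature vectors, replaces the learned Transformer layers with a single-head self-attention that has no trainable weights, and assumes that joint modelling means attending over the concatenated shots of all videos. The names `self_attention`, `encode_shots`, and `joint_model` are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption: identity query/key/value projections (no learned weights),
    so each output is a convex combination of the input vectors.
    """
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[i] for w, v in zip(weights, tokens)) for i in range(d)])
    return out


def mean_pool(tokens):
    """Average a list of vectors into one vector."""
    d = len(tokens[0])
    return [sum(t[i] for t in tokens) / len(tokens) for i in range(d)]


def encode_shots(video):
    """Level 1 (assumed form): attend over the frames of each shot,
    then pool them into one vector per shot."""
    return [mean_pool(self_attention(shot)) for shot in video]


def joint_model(videos):
    """Level 2 (assumed form): attend over the shots of all videos at once,
    then split the result back into per-video shot sequences."""
    shot_vecs = [encode_shots(v) for v in videos]
    flat = [s for v in shot_vecs for s in v]
    mixed = self_attention(flat)
    out, start = [], 0
    for v in shot_vecs:
        out.append(mixed[start:start + len(v)])
        start += len(v)
    return out


# Toy input: two "similar" videos with 2 and 3 shots, 2-dimensional frame features.
video_a = [[[1.0, 0.0], [0.9, 0.1]], [[0.0, 1.0], [0.1, 0.9], [0.2, 0.8]]]
video_b = [[[1.0, 0.1]], [[0.5, 0.5], [0.4, 0.6]], [[0.0, 1.0]]]
contextual = joint_model([video_a, video_b])
```

In this sketch every shot vector in `contextual` mixes information from shots of both videos, which is the cross-video aggregation the abstract refers to. A real model would add learned projections, multiple heads, positional information, and a scoring head that predicts shot importance.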
Pages: 3904-3917
Number of pages: 14
Related papers
50 in total
  • [1] Video Co-summarization: Video Summarization by Visual Co-occurrence
    Chu, Wen-Sheng
    Song, Yale
    Jaimes, Alejandro
    2015 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2015, : 3584 - 3592
  • [2] Shot Level Egocentric Video Co-summarization
    Sahu, Abhimanyu
    Chowdhury, Ananda S.
    2018 24TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2018, : 2887 - 2892
  • [3] Egocentric video co-summarization using transfer learning and refined random walk on a constrained graph
    Sahu, Abhimanyu
    Chowdhury, Ananda S.
    PATTERN RECOGNITION, 2022, 134
  • [4] Hierarchical video summarization based on video structure and highlight
    Geng, Yuliang
    Xu, De
    Feng, Songhe
    STRUCTURAL, SYNTACTIC, AND STATISTICAL PATTERN RECOGNITION, PROCEEDINGS, 2006, 4109 : 226 - 234
  • [5] Hierarchical Time-Aware Summarization with an Adaptive Transformer for Video Captioning
    Cardoso, Leonardo Vilela
    Guimaraes, Silvio Jamil Ferzoli
    do Patrocinio Jr, Zenilton Kleber Goncalves
    INTERNATIONAL JOURNAL OF SEMANTIC COMPUTING, 2023, 17 (04) : 569 - 592
  • [6] Hierarchical video summarization
    Ratakonda, K
    Sezan, MI
    Crinon, R
    VISUAL COMMUNICATIONS AND IMAGE PROCESSING '99, PARTS 1-2, 1998, 3653 : 1531 - 1541
  • [7] Hierarchical video summarization based on context clustering
    Tseng, BL
    Smith, JR
    INTERNET MULTIMEDIA MANAGEMENT SYSTEMS IV, 2003, 5242 : 14 - 25
  • [8] Efficient Transformer for Video Summarization
    Kolmakova, Tatiana
    Makarov, Ilya
    ADVANCES IN COMPUTATIONAL INTELLIGENCE, IWANN 2023, PT II, 2023, 14135 : 52 - 65
  • [9] A Static Video Summarization Method Based on Hierarchical Clustering
    Guimaraes, Silvio Jamil F.
    Gomes, Willer
    PROGRESS IN PATTERN RECOGNITION, IMAGE ANALYSIS, COMPUTER VISION, AND APPLICATIONS, 2010, 6419 : 46 - 54
  • [10] A Hierarchical Representation Model Based on Longformer and Transformer for Extractive Summarization
    Yang, Shihao
    Zhang, Shaoru
    Fang, Ming
    Yang, Fengqin
    Liu, Shuhua
    ELECTRONICS, 2022, 11 (11)