Unsupervised Video Summarization with Attentive Conditional Generative Adversarial Networks

Cited by: 51
Authors
He, Xufeng [1]
Hua, Yang [2]
Song, Tao [1]
Zhang, Zongpu [1]
Xue, Zhengui [1]
Ma, Ruhui [1]
Robertson, Neil [2]
Guan, Haibing [1]
Affiliations
[1] Shanghai Jiao Tong University, Shanghai, People's Republic of China
[2] Queen's University Belfast, Belfast, Antrim, Northern Ireland
Keywords
video summarization; generative adversarial networks; video analysis; deep learning
DOI
10.1145/3343031.3351056
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
With the rapid growth of video data, video summarization techniques play a key role in reducing the effort people spend exploring video content by generating concise but informative summaries. Although supervised video summarization approaches have been well studied and achieve state-of-the-art performance, unsupervised methods remain in high demand because high-quality annotations are intrinsically difficult to obtain. In this paper, we propose a novel yet simple unsupervised video summarization method with attentive conditional Generative Adversarial Networks (GANs). First, we build our framework upon GANs in an unsupervised manner. Specifically, the generator produces high-level weighted frame features and predicts frame-level importance scores, while the discriminator tries to distinguish between weighted frame features and raw frame features. Furthermore, we use a conditional feature selector to guide the GAN model to focus on the more important temporal regions of the whole video. Second, we are the first to introduce frame-level multi-head self-attention for video summarization, which learns long-range temporal dependencies along the whole video sequence and overcomes the local constraints of recurrent units, e.g., LSTMs. Extensive evaluations on two datasets, SumMe and TVSum, show that our proposed framework surpasses state-of-the-art unsupervised methods by a large margin and even outperforms most supervised methods. We also conduct an ablation study to reveal the influence of each component and parameter setting in our framework.
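The generator side of the abstract (self-attention over all frames, then frame-level importance scores, then weighted frame features) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes identity query/key/value projections instead of learned weights, uses a sigmoid of the mean attended feature as a stand-in for a learned scoring layer, and omits the discriminator and the conditional feature selector entirely. All function names are hypothetical.

```python
import math
import random


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention_head(frames):
    """Scaled dot-product self-attention in which every frame attends to every frame.

    Assumption: queries, keys, and values are the inputs themselves
    (identity projections); a trained model would learn these projections.
    """
    dim = len(frames[0])
    attended = []
    for query in frames:
        logits = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in frames
        ]
        weights = softmax(logits)
        attended.append([
            sum(weights[j] * frames[j][c] for j in range(len(frames)))
            for c in range(dim)
        ])
    return attended


def multi_head_self_attention(frames, num_heads):
    """Split the feature dimension into heads, attend per head, then concatenate."""
    dim = len(frames[0])
    if dim % num_heads != 0:
        raise ValueError("feature dimension must be divisible by num_heads")
    head_dim = dim // num_heads
    heads = [
        self_attention_head([f[i * head_dim:(i + 1) * head_dim] for f in frames])
        for i in range(num_heads)
    ]
    return [sum((head[t] for head in heads), []) for t in range(len(frames))]


def importance_scores(attended):
    """Placeholder scoring: sigmoid of the mean attended feature (assumption)."""
    return [1.0 / (1.0 + math.exp(-sum(v) / len(v))) for v in attended]


def weight_frames(frames, scores):
    """Scale each raw frame feature vector by its importance score."""
    return [[s * x for x in frame] for frame, s in zip(frames, scores)]


# Toy input: 5 frames with 8-dimensional random features (illustrative only).
random.seed(0)
frames = [[random.gauss(0.0, 1.0) for _ in range(8)] for _ in range(5)]
attended = multi_head_self_attention(frames, num_heads=2)
scores = importance_scores(attended)
weighted = weight_frames(frames, scores)
```

In the setup the abstract describes, a discriminator would then be trained to tell `weighted` apart from the raw `frames`. Because attention here spans the whole sequence in one step, distant frames can influence each other directly, which is the property the abstract contrasts with recurrent units.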
Pages: 2296-2304
Number of pages: 9