Multimodal-Based and Aesthetic-Guided Narrative Video Summarization

Cited by: 5
Authors
Xie, Jiehang [1]
Chen, Xuanbai [1]
Zhang, Tianyi [2]
Zhang, Yixuan [1]
Lu, Shao-Ping [1]
Cesar, Pablo [2]
Yang, Yulu [1]
Affiliations
[1] Nankai Univ, TKLNDST, CS, Nankai 300071, Peoples R China
[2] Ctr Wiskunde & Informat, NL-098 XG Amsterdam, Netherlands
Keywords
Narrative video summarization; multimodal information; aesthetic guidance
DOI
10.1109/TMM.2022.3183394
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Narrative videos usually convey their main content through multiple narrative channels, such as audio, video frames, and subtitles. Existing video summarization approaches rarely consider these multidimensional narrative inputs, or they ignore the impact of artistic shot assembly when applied directly to narrative videos. This paper introduces a multimodal-based and aesthetic-guided narrative video summarization method. Our method leverages multimodal information, including visual content, subtitles, and audio, through dedicated key shot selection, subtitle summarization, and highlight extraction components. Furthermore, under the guidance of cinematographic aesthetics, we design a novel shot assembly module that ensures the completeness of shot content and then assembles the selected shots into the desired summary. In addition, our method supports flexible specification of shot selection: it automatically selects shots that are semantically related to user-designed text. Through extensive quantitative experimental evaluations and user studies, we demonstrate that our method effectively preserves the important narrative information of the original video and can rapidly produce high-quality, aesthetic-guided narrative video summaries.
Pages: 4894-4908
Page count: 15
Related papers
  • [1] A Knowledge Augmented and Multimodal-Based Framework for Video Summarization
    Xie, Jiehang
    Chen, Xuanbai
    Lu, Shao-Ping
    Yang, Yulu
    [J]. PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022
  • [2] Aesthetic-guided Outward Image Cropping
    Zhong, Lei
    Li, Feng-Heng
    Huang, Hao-Zhi
    Zhang, Yong
    Lu, Shao-Ping
    Wang, Jue
    [J]. ACM TRANSACTIONS ON GRAPHICS, 2021, 40 (06)
  • [3] DEEP LEARNING FOR MULTIMODAL-BASED VIDEO INTERESTINGNESS PREDICTION
    Shen, Yuesong
    Demarty, Claire-Helene
    Duong, Ngoc Q. K.
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2017: 1003 - 1008
  • [4] Video Summarization Based on Multimodal Features
    Zhang, Yu
    Liu, Ju
    Liu, Xiaoxi
    Gao, Xuesong
    [J]. INTERNATIONAL JOURNAL OF MULTIMEDIA DATA ENGINEERING & MANAGEMENT, 2020, 11 (04): 60 - 76
  • [5] Video semantic concept discovery using multimodal-based association classification
    Lin, Lin
    Ravitz, Guy
    Shyu, Mei-Ling
    Chen, Shu-Ching
    [J]. 2007 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOLS 1-5, 2007: 859 - +
  • [6] Interactive System for Video Summarization Based on Multimodal Fusion
    Zheng Li
    Xiaobing Du
    Cuixia Ma
    Yanfeng Li
    Hongan Wang
    [J]. Journal of Beijing Institute of Technology, 2019, 28 (01): 27 - 34
  • [7] Multimodal Video Summarization based on Fuzzy Similarity Features
    Psallidas, Theodoros
    Vasilakakis, Michael D.
    Spyrou, Evaggelos
    Iakovidis, Dimitris K.
    [J]. 2022 IEEE 14TH IMAGE, VIDEO, AND MULTIDIMENSIONAL SIGNAL PROCESSING WORKSHOP (IVMSP), 2022
  • [8] Repurposing existing deep networks for caption and aesthetic-guided image cropping
    Horanyi, Nora
    Xia, Kedi
    Yi, Kwang Moo
    Bojja, Abhishake Kumar
    Leonardis, Ales
    Chang, Hyung Jin
    [J]. PATTERN RECOGNITION, 2022, 126
  • [9] MLASK: Multimodal Summarization of Video-based News Articles
    Krubinski, Mateusz
    Pecina, Pavel
    [J]. 17TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EACL 2023, 2023: 910 - 924
  • [10] Topic-guided abstractive multimodal summarization with multimodal output
    Rafi, Shaik
    Das, Ranjita
    [J]. NEURAL COMPUTING & APPLICATIONS, 2023