Multimodal-Based and Aesthetic-Guided Narrative Video Summarization

Cited by: 5
Authors
Xie, Jiehang [1]
Chen, Xuanbai [1]
Zhang, Tianyi [2]
Zhang, Yixuan [1]
Lu, Shao-Ping [1]
Cesar, Pablo [2]
Yang, Yulu [1]
Affiliations
[1] Nankai Univ, TKLNDST, CS, Nankai 300071, Peoples R China
[2] Ctr Wiskunde & Informat, NL-1098 XG Amsterdam, Netherlands
Keywords
Narrative video summarization; multimodal information; aesthetic guidance
DOI
10.1109/TMM.2022.3183394
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Narrative videos usually convey their main content through multiple kinds of narrative information, such as audio, video frames, and subtitles. Existing video summarization approaches rarely consider these multidimensional narrative inputs, or they ignore the impact of artistic shot assembly when applied directly to narrative videos. This paper introduces a multimodal-based and aesthetic-guided narrative video summarization method. The method leverages multimodal information, including visual content, subtitles, and audio, through dedicated key shot selection, subtitle summarization, and highlight extraction components. Furthermore, under the guidance of cinematographic aesthetics, we design a novel shot assembly module that ensures the completeness of shot content and then assembles the selected shots into the desired summary. The method also supports flexible specification of shot selection: it automatically selects shots that are semantically related to user-provided text. Through extensive quantitative experiments and user studies, we demonstrate that our method effectively preserves the important narrative information of the original video and can rapidly produce high-quality, aesthetic-guided narrative video summaries.
Pages: 4894-4908
Number of pages: 15