Video Summarization Leveraging Multimodal Information for Presentations

Cited by: 0
Authors
Liu, Hanchao [1]
Chen, Dapeng [1]
Li, Rongjun [1]
Xue, Wenyuan [1]
Peng, Wei [1]
Affiliations
[1] Huawei Technol Co Ltd, IT Platform Chief Expert Off, Shenzhen, Peoples R China
Source
Keywords
multimodal; video summarization
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This demonstration introduces a video summarization system that leverages multimodal information to efficiently extract the essential content of presentations. In contrast to existing methods, which focus primarily on daily-life videos and use only visual information, our system extracts multimodal information (speech, text, and visual information) from videos of presentations. Specifically, the proposed system extracts crucial slide texts from key frames and uses them as queries to filter the speech transcripts. By piecing together the video clips that correspond to the filtered speech transcripts, the system outputs the final video summary. An evaluation on ICCV 2017 videos demonstrates the effectiveness of the proposed system compared with the lead-3 baseline.
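The filtering step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how slide texts are matched against the transcript, so plain token overlap is assumed here, and the names Segment, filter_segments, merge_clips, and lead_3, as well as the min_overlap and max_gap parameters, are hypothetical. The slide texts and timestamped transcript segments are assumed to have been produced already by separate key-frame text extraction and speech recognition steps.

```python
import re
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass
class Segment:
    """One timestamped transcript segment (hypothetical structure)."""
    start: float  # seconds
    end: float    # seconds
    text: str


def tokens(text: str) -> Set[str]:
    """Lowercase alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def filter_segments(segments: List[Segment], slide_queries: List[str],
                    min_overlap: int = 2) -> List[Segment]:
    """Keep segments sharing at least min_overlap tokens with any slide query.

    Token overlap is an assumed stand-in for the matching method,
    which the abstract does not specify.
    """
    query_tokens = [tokens(q) for q in slide_queries]
    return [
        seg for seg in segments
        if any(len(tokens(seg.text) & q) >= min_overlap for q in query_tokens)
    ]


def merge_clips(segments: List[Segment],
                max_gap: float = 1.0) -> List[Tuple[float, float]]:
    """Merge kept segments into (start, end) clips, joining small gaps."""
    clips: List[Tuple[float, float]] = []
    for seg in sorted(segments, key=lambda s: s.start):
        if clips and seg.start - clips[-1][1] <= max_gap:
            clips[-1] = (clips[-1][0], max(clips[-1][1], seg.end))
        else:
            clips.append((seg.start, seg.end))
    return clips


def lead_3(segments: List[Segment]) -> List[Segment]:
    """Lead-3 baseline: the first three transcript segments."""
    return sorted(segments, key=lambda s: s.start)[:3]


# Made-up example data, for illustration only.
transcript = [
    Segment(0.0, 4.0, "Hello everyone and thanks for coming"),
    Segment(4.0, 9.0, "We propose a multimodal video summarization system"),
    Segment(9.0, 13.0, "It extracts slide text from key frames"),
    Segment(13.0, 18.0, "Let me get some water first"),
    Segment(20.0, 25.0, "Slide text filters the speech transcripts"),
]
slide_queries = ["Multimodal video summarization", "Slide text as queries"]

kept = filter_segments(transcript, slide_queries)
print(merge_clips(kept))
```

The resulting (start, end) pairs would then be cut from the source video and concatenated to form the summary; lead_3 gives the baseline selection for comparison.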
Pages: 5251-5252
Page count: 2