Video Summarization Leveraging Multimodal Information for Presentations

Authors
Liu, Hanchao [1]
Chen, Dapeng [1]
Li, Rongjun [1]
Xue, Wenyuan [1]
Peng, Wei [1]
Affiliation
[1] Huawei Technologies Co., Ltd., IT Platform Chief Expert Office, Shenzhen, People's Republic of China
Source
Keywords
multimodal; video summarization
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This demonstration introduces a video summarization system that leverages multimodal information to efficiently extract the essential content of presentations. In contrast to existing methods, which focus primarily on daily-life videos and use only visual information, our system extracts speech, text, and visual information from presentation videos. Specifically, the proposed system extracts crucial slide texts from key-frames and uses them as queries to filter the speech transcripts. By piecing together the video clips that correspond to the filtered speech transcripts, the system outputs the final video summary. An evaluation on ICCV 2017 videos demonstrates the effectiveness of the proposed system compared with the lead-3 baseline.
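The filtering step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how slide texts are matched against the transcript, so the word-overlap score, the `threshold` and `max_gap` parameters, and all names below (`Segment`, `filter_segments`, `merge_clips`, `lead_3`) are assumptions made for illustration only, not the authors' implementation. Key-frame detection, slide text extraction, and speech recognition are assumed to have already produced the inputs.

```python
import re
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcript sentence with start and end times in seconds (hypothetical type)."""
    start: float
    end: float
    text: str


def tokens(text):
    """Lowercased alphanumeric word set; a stand-in for real text processing."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def filter_segments(segments, slide_queries, threshold=0.5):
    """Keep segments covering at least `threshold` of the words of some slide query.

    Word overlap is an assumed scoring rule; the source does not specify one.
    """
    kept = []
    for seg in segments:
        seg_tokens = tokens(seg.text)
        for query in slide_queries:
            q_tokens = tokens(query)
            if q_tokens and len(q_tokens & seg_tokens) / len(q_tokens) >= threshold:
                kept.append(seg)
                break
    return kept


def merge_clips(segments, max_gap=1.0):
    """Turn kept segments into (start, end) clips, joining those within max_gap seconds."""
    clips = []
    for seg in sorted(segments, key=lambda s: s.start):
        if clips and seg.start - clips[-1][1] <= max_gap:
            clips[-1] = (clips[-1][0], max(clips[-1][1], seg.end))
        else:
            clips.append((seg.start, seg.end))
    return clips


def lead_3(segments):
    """Lead-3 baseline: simply take the first three transcript segments."""
    return sorted(segments, key=lambda s: s.start)[:3]


# Toy inputs, invented for this example.
segments = [
    Segment(0.0, 4.0, "Hello everyone and thanks for coming"),
    Segment(4.0, 9.0, "We propose a multimodal video summarization system"),
    Segment(9.0, 15.0, "It uses slide text as queries over speech transcripts"),
    Segment(15.0, 20.0, "Let me adjust the microphone"),
    Segment(30.0, 36.0, "The evaluation uses conference presentation videos"),
]
slide_queries = ["Multimodal video summarization", "Slide text queries", "Evaluation"]

clips = merge_clips(filter_segments(segments, slide_queries))
print(clips)  # [(4.0, 15.0), (30.0, 36.0)]
```

On this toy input the query-based filter skips the greeting and the microphone remark, whereas `lead_3` would return the greeting plus the next two sentences regardless of their content.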
Pages: 5251-5252
Page count: 2