Video Summarization Leveraging Multimodal Information for Presentations

Citations: 0

Authors:

Liu, Hanchao [1]

Chen, Dapeng [1]

Li, Rongjun [1]

Xue, Wenyuan [1]

Peng, Wei [1]

Affiliations:

[1] Huawei Technologies Co., Ltd., IT Platform Chief Expert Office, Shenzhen, People's Republic of China

Source:

INTERSPEECH 2023 | 2023

Keywords:

multimodal; video summarization

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

This demonstration introduces a video summarization system that leverages multimodal information to efficiently extract the essential content of presentations. In contrast to existing methods, which focus primarily on daily-life videos and use only visual information, our system extracts speech, text, and visual information from presentation videos. Specifically, it extracts crucial slide text from key frames and uses it as queries to filter the speech transcripts. By piecing together the video clips that correspond to the filtered transcripts, the system outputs the final video summary. An evaluation on ICCV 2017 videos demonstrates the effectiveness of the proposed system compared with the lead-3 baseline.
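
The query-and-filter pipeline described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how slide text is matched against the transcript, so the token-overlap score, the `threshold` and `max_gap` parameters, the `Segment` type, and all function names here are assumptions chosen for illustration. Key-frame detection, OCR, and speech recognition are taken as already done, and lead-3 is read as "the first three transcript segments".

```python
from dataclasses import dataclass
import re


@dataclass
class Segment:
    """One transcript segment with its time span in seconds (hypothetical type)."""
    start: float
    end: float
    text: str


def tokens(text):
    """Lowercased alphanumeric word tokens; a stand-in for a real tokenizer."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_score(query, segment_text):
    """Fraction of query tokens that also occur in the segment (assumed score)."""
    q = tokens(query)
    if not q:
        return 0.0
    return len(q & tokens(segment_text)) / len(q)


def filter_segments(segments, slide_queries, threshold=0.5):
    """Keep the segments that match at least one slide-text query."""
    return [
        s for s in segments
        if any(overlap_score(q, s.text) >= threshold for q in slide_queries)
    ]


def merge_clips(segments, max_gap=1.0):
    """Turn kept segments into (start, end) clips, joining near-adjacent ones."""
    clips = []
    for s in sorted(segments, key=lambda seg: seg.start):
        if clips and s.start - clips[-1][1] <= max_gap:
            clips[-1] = (clips[-1][0], max(clips[-1][1], s.end))
        else:
            clips.append((s.start, s.end))
    return clips


def lead_3(segments):
    """Lead-3 baseline: the first three segments in time order."""
    return sorted(segments, key=lambda seg: seg.start)[:3]


# Toy data, invented for this example.
segments = [
    Segment(0.0, 4.0, "Welcome everyone and thanks for coming"),
    Segment(4.0, 9.0, "We propose a multimodal video summarization system"),
    Segment(9.5, 14.0, "The system filters speech transcripts with slide text"),
    Segment(30.0, 35.0, "Any questions from the audience"),
]
slide_queries = ["video summarization system", "speech transcripts"]

kept = filter_segments(segments, slide_queries)
clips = merge_clips(kept)
print(clips)  # -> [(4.0, 14.0)]
```

On this toy input the greeting and the closing question are dropped, and the two matching segments are joined into a single clip because the gap between them is below `max_gap`. A real system would likely replace the token-overlap score with a stronger text-matching method.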


Pages: 5251-5252

Page count: 2

