Automatic Text-based Clip Composition for Video News

Cited by: 0

Authors:

Quandt, Dennis [1]

Altmeyer, Philipp [1]

Ruppel, Wolfgang [1]

Narroschke, Matthias [1]

Affiliations:

[1] RheinMain Univ Appl Sci, Wiesbaden, Germany

Source:

9TH INTERNATIONAL CONFERENCE ON MULTIMEDIA AND IMAGE PROCESSING, ICMIP 2024 | 2024

Keywords:

News Clip Editing; AI Video Editing; Computational Cinematography; Text-based Clip Sequencing;

DOI:

10.1145/3665026.3665042

Chinese Library Classification (CLC):

TP39 [Computer applications];

Discipline classification codes:

081203 ; 0835 ;

Abstract:

News broadcasters must produce engaging video clips faster than ever to secure their position in the market. This is due, in part, to the growing number of news sources and to changes in media consumption among target audiences. This evolution has amplified the need to produce news clips quickly, a requirement that remains at odds with traditionally manual and time-consuming video editing processes. Despite advances in automating video news production, current systems have yet to reach the level of automation and the quality standards required for professional news broadcasting. Addressing this gap, we propose a novel transformer-based framework for automatically composing news clips to streamline the editing process. Our framework is built on a vision-language feature embedding mechanism and a cross-attention transformer architecture designed to generate multi-shot news clips that are semantically coherent with the editorial text and stylistically consistent with professional editing benchmarks. Using a single GPU, our framework composes 2-minute news clips from source material ranging from 20 minutes to 2 hours in length in less than 5 minutes. In our user study, target groups with different experience levels rated the generated videos on a 6-point Likert scale. Users gave the news clips generated by our framework an average score of 4.13 and the manually edited news clips an average score of 4.58.


Pages: 106-112

Page count: 7
