Automatic Text-based Clip Composition for Video News

Cited by: 0

Authors:

Quandt, Dennis [1]

Altmeyer, Philipp [1]

Ruppel, Wolfgang [1]

Narroschke, Matthias [1]

Affiliations:

[1] RheinMain University of Applied Sciences, Wiesbaden, Germany

Source:

9TH INTERNATIONAL CONFERENCE ON MULTIMEDIA AND IMAGE PROCESSING, ICMIP 2024 | 2024

Keywords:

News Clip Editing; AI Video Editing; Computational Cinematography; Text-based Clip Sequencing

DOI:

10.1145/3665026.3665042

Chinese Library Classification (CLC):

TP39 [Computer applications]

Discipline classification codes:

081203; 0835

Abstract:

News broadcasters must produce engaging video clips faster than ever to secure their position in the market. This is due, in part, to the growing number of news sources and to changes in media consumption among target audiences. This evolution has amplified the need to produce news clips quickly, a requirement at odds with traditionally manual and time-consuming video editing processes. Despite advances in automating video news production, current systems have yet to reach the level of automation and the quality standards required for professional news broadcasting. Addressing this gap, we propose a novel transformer-based framework for automatically composing news clips to streamline the editing process. Our framework is based on a vision-language feature embedding mechanism and a cross-attention transformer architecture designed to generate multi-shot news clips that are semantically coherent with the editorial text and stylistically consistent with professional editing benchmarks. Our framework composes news clips with a length of 2 minutes from source material ranging from 20 minutes to 2 hours in less than 5 minutes using a single GPU. In our user study, target groups with different experience levels rated the generated videos on a 6-point Likert scale. Users rated the news clips generated by our framework with an average score of 4.13 and the manually edited news clips with an average score of 4.58.

Pages: 106-112

Page count: 7