Automatic Text-based Clip Composition for Video News

Authors
Quandt, Dennis [1]
Altmeyer, Philipp [1]
Ruppel, Wolfgang [1]
Narroschke, Matthias [1]
Affiliations
[1] RheinMain Univ Appl Sci, Wiesbaden, Germany
Keywords
News Clip Editing; AI Video Editing; Computational Cinematography; Text-based Clip Sequencing
DOI
10.1145/3665026.3665042
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
News broadcasters must produce engaging video clips faster than ever to secure their position in the market. This is due, in part, to the growing number of news sources and changes in media consumption among target audiences. This evolution has amplified the need to produce news clips quickly, a requirement at odds with traditionally manual and time-consuming video editing processes. Despite advances in automating video news production, current systems have yet to reach the level of automation and the quality standards required for professional news broadcasting. Addressing this gap, we propose a novel transformer-based framework for automatically composing news clips to streamline the editing process. Our framework is built on a vision-language feature embedding mechanism and a cross-attention transformer architecture designed to generate multi-shot news clips that are semantically coherent with the editorial text and stylistically consistent with professional editing benchmarks. Our framework composes 2-minute news clips from 20 minutes to 2 hours of source material in less than 5 minutes on a single GPU. In our user study, target groups with different experience levels rated the generated videos on a 6-point Likert scale. Users gave the news clips generated by our framework an average score of 4.13 and the manually edited news clips an average score of 4.58.
Pages: 106-112
Page count: 7