Automatic Text-based Clip Composition for Video News

Authors
Quandt, Dennis [1]
Altmeyer, Philipp [1]
Ruppel, Wolfgang [1]
Narroschke, Matthias [1]
Affiliations
[1] RheinMain Univ Appl Sci, Wiesbaden, Germany
Keywords
News Clip Editing; AI Video Editing; Computational Cinematography; Text-based Clip Sequencing
DOI
10.1145/3665026.3665042
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
News broadcasters must produce engaging video clips faster than ever to secure their position in the market. This is due, in part, to the growing number of news sources and to changes in media consumption among target audiences. This evolution has amplified the need to produce news clips quickly, a requirement that remains at odds with traditionally manual and time-consuming video editing processes. Despite advances in automating video news production, current systems have yet to reach the level of automation and the quality standards required for professional news broadcasting. Addressing this gap, we propose a novel transformer-based framework for automatically composing news clips to streamline the editing process. Our framework is built on a vision-language feature embedding mechanism and a cross-attention transformer architecture designed to generate multi-shot news clips that are semantically coherent with the editorial text and stylistically consistent with professional editing benchmarks. It composes 2-minute news clips from source material ranging from 20 minutes to 2 hours in less than 5 minutes on a single GPU. In our user study, target groups with different experience levels rated the generated videos on a 6-point Likert scale. Users gave the news clips generated by our framework an average score of 4.13 and the manually edited news clips an average score of 4.58.
Pages: 106-112
Page count: 7
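
The abstract describes matching editorial text to source footage through vision-language embeddings. As a rough illustration of that general idea only, and not the authors' cross-attention transformer, the sketch below greedily assigns to each sentence embedding the unused shot with the highest cosine similarity, stopping once a duration budget would be exceeded. The function names, the toy three-dimensional vectors, the greedy strategy, and the 120-second default budget (chosen to echo the 2-minute clip length in the abstract) are all assumptions made for this example.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical shot record: (shot id, duration in seconds, embedding vector).
Shot = Tuple[str, float, Sequence[float]]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def compose_clip(
    sentence_embeddings: Sequence[Sequence[float]],
    shots: Sequence[Shot],
    max_seconds: float = 120.0,
) -> List[str]:
    """Greedy baseline: one best-matching unused shot per sentence, in text order."""
    chosen: List[str] = []
    used = set()
    total = 0.0
    for emb in sentence_embeddings:
        # Only shots that are unused and still fit in the remaining budget.
        candidates = [
            s for s in shots
            if s[0] not in used and total + s[1] <= max_seconds
        ]
        if not candidates:
            break  # no remaining shot fits the budget
        best = max(candidates, key=lambda s: cosine(emb, s[2]))
        chosen.append(best[0])
        used.add(best[0])
        total += best[1]
    return chosen


if __name__ == "__main__":
    # Toy data; real embeddings would come from a vision-language model.
    shots = [
        ("anchor", 40.0, [1.0, 0.0, 0.0]),
        ("street", 50.0, [0.0, 1.0, 0.0]),
        ("interview", 45.0, [0.0, 0.0, 1.0]),
    ]
    text = [[0.1, 0.9, 0.0], [0.0, 0.2, 0.8], [0.9, 0.1, 0.0]]
    print(compose_clip(text, shots))
```

A greedy match like this ignores shot order, pacing, and editing style, which is presumably part of what a learned cross-attention model is meant to capture; it serves only as a simple baseline to make the text-to-shot matching idea concrete.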