Editing like Humans: A Contextual, Multimodal Framework for Automated Video Editing

Cited by: 3

Authors:

Koorathota, Sharath [1,2]

Adelman, Patrick [2]

Cotton, Kelly [3]

Sajda, Paul [1]

Affiliations:

[1] Columbia Univ, Dept Biomed Engn, New York, NY 10027 USA

[2] Fovea Inc, New York, NY 10001 USA

[3] CUNY, Grad Ctr, Dept Psychol, New York, NY USA

Source:

2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2021 | 2021

DOI:

10.1109/CVPRW53098.2021.00186

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

We propose an automated video editing model, which we term contextual and multimodal video editing (CMVE). The model leverages visual and textual metadata describing videos, integrating essential information from both modalities, and uses an editing style learned from a single example video to coherently combine clips. The editing model is useful for tasks such as generating news clip montages and highlight reels given a text query that describes the video storyline. The model exploits the perceptual similarity between video frames, objects in videos, and text descriptions to emulate coherent video editing. Amazon Mechanical Turk participants made judgements comparing CMVE to expert human editing. Experimental results showed no significant difference between CMVE and human-edited videos in terms of matching the text query and the level of interest each generates, suggesting that CMVE can effectively integrate semantic information across visual and textual modalities and create perceptually coherent videos of a quality typical of human video editors. We publicly release an online demonstration of our method.

Pages: 1701-1709

Page count: 9
