Iterative Text-Based Editing of Talking-Heads Using Neural Retargeting

Cited by: 10

Authors:

Yao, Xinwei [1]

Fried, Ohad [2,3]

Fatahalian, Kayvon [1]

Agrawala, Maneesh [1]

Affiliations:

[1] Stanford Univ, Dept Comp Sci, 353 Jane Stanford Way, Stanford, CA 94305 USA

[2] Interdisciplinary Ctr Herzliya, Herzliyya, Israel

[3] IDC Herzliya, Efi Arazi Sch Comp Sci, IL-46150 Herzliyya, Israel

Source:

ACM TRANSACTIONS ON GRAPHICS | 2021 / Vol. 40 / No. 03

Funding:

U.S. National Science Foundation;

Keywords:

Text-based video editing; talking-heads; phonemes; retargeting;

DOI:

10.1145/3449063

Chinese Library Classification:

TP31 [Computer Software];

Subject classification codes:

081202 ; 0835 ;

Abstract:

We present a text-based tool for editing talking-head video that enables an iterative editing workflow. On each iteration, users can edit the wording of the speech, further refine mouth motions if necessary to reduce artifacts, and manipulate non-verbal aspects of the performance by inserting mouth gestures (e.g., a smile) or changing the overall performance style (e.g., energetic, mumble). Our tool requires only 2 to 3 minutes of the target actor video and it synthesizes the video for each iteration in about 40 seconds, allowing users to quickly explore many editing possibilities as they iterate. Our approach is based on two key ideas. (1) We develop a fast phoneme search algorithm that can quickly identify phoneme-level subsequences of the source repository video that best match a desired edit. This enables our fast iteration loop. (2) We leverage a large repository of video of a source actor and develop a new self-supervised neural retargeting technique for transferring the mouth motions of the source actor to the target actor. This allows us to work with relatively short target actor videos, making our approach applicable in many real-world editing scenarios. Finally, our refinement and performance controls give users the ability to further fine-tune the synthesized results.
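The abstract does not give the details of the phoneme search, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes the source repository and the desired edit are both plain lists of phoneme labels, and it uses a simple dynamic program to cover the edit with the fewest contiguous repository subsequences. The function names `longest_match_at` and `cover_query`, and the fewest-segments criterion, are assumptions made for this example.

```python
from typing import List, Optional, Tuple


def longest_match_at(repo: List[str], query: List[str], q_start: int) -> Tuple[int, int]:
    """Return (repo_start, length) of the longest contiguous repository run
    that matches query[q_start:]. Brute force, for illustration only."""
    best = (-1, 0)
    for r in range(len(repo)):
        k = 0
        while (r + k < len(repo) and q_start + k < len(query)
               and repo[r + k] == query[q_start + k]):
            k += 1
        if k > best[1]:
            best = (r, k)
    return best


def cover_query(repo: List[str], query: List[str]) -> Optional[List[Tuple[int, int]]]:
    """Cover `query` with the fewest contiguous repository subsequences.

    Returns a list of (repo_start, length) segments in query order,
    or None if some query phoneme never occurs in the repository.
    """
    n = len(query)
    inf = float("inf")
    cost = [inf] * (n + 1)   # cost[j]: fewest segments covering query[:j]
    back = [None] * (n + 1)  # back[j]: (previous j, repo_start, length)
    cost[0] = 0
    for i in range(n):
        if cost[i] == inf:
            continue
        r, k = longest_match_at(repo, query, i)
        # Every prefix of the longest match is itself a valid segment.
        for length in range(1, k + 1):
            j = i + length
            if cost[i] + 1 < cost[j]:
                cost[j] = cost[i] + 1
                back[j] = (i, r, length)
    if cost[n] == inf:
        return None
    segments = []
    j = n
    while j > 0:
        i, r, length = back[j]
        segments.append((r, length))
        j = i
    segments.reverse()
    return segments


# Hypothetical repository: phonemes of "hello world".
repo = ["HH", "AH", "L", "OW", "W", "ER", "L", "D"]
print(cover_query(repo, ["L", "OW", "W", "ER"]))
```

Minimizing the number of segments is only one plausible proxy for "best match": fewer cuts means fewer seams to blend. A practical system would presumably also weigh timing and visual context, which this sketch ignores.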


Pages: 14
