SPACE: Speech-driven Portrait Animation with Controllable Expression

Cited by: 0

Authors:

Gururani, Siddharth [1]

Mallya, Arun [1]

Wang, Ting-Chun [1]

Valle, Rafael [1]

Liu, Ming-Yu [1]

Affiliations:

[1] NVIDIA, Santa Clara, CA 95051 USA

Source:

2023 IEEE/CVF International Conference on Computer Vision (ICCV 2023) | 2023

Keywords:

DOI:

10.1109/ICCV51070.2023.01912

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Animating portraits using speech has received growing attention in recent years, with various creative and practical use cases. An ideal generated video should have good lip sync with the audio, natural facial expressions and head motions, and high frame quality. In this work, we present SPACE, which uses speech and a single image to generate high-resolution, expressive videos with realistic head pose, without requiring a driving video. It uses a multi-stage approach, combining the controllability of facial landmarks with the high-quality synthesis power of a pretrained face generator. SPACE also allows for the control of emotions and their intensities. Our method outperforms prior methods in objective metrics for image quality and facial motions and is strongly preferred by users in pairwise comparisons. Please visit the project page to view the videos and to see more results: https://research.nvidia.com/labs/dir/space/.


Pages: 20857-20866

Number of pages: 10

