Personalized Audio-Driven 3D Facial Animation via Style-Content Disentanglement

Cited by: 2
Authors
Chai, Yujin [1 ]
Shao, Tianjia [1 ]
Weng, Yanlin [1 ]
Zhou, Kun [1 ]
Affiliations
[1] Zhejiang Univ, State Key Lab CAD & CG, Hangzhou 310058, Zhejiang, Peoples R China
Keywords
Audio-driven animation; facial animation; style learning; style-content disentanglement; facial motion decomposition
DOI
10.1109/TVCG.2022.3230541
CLC Classification
TP31 [Computer Software];
Subject Classification Codes
081202 ; 0835 ;
Abstract
We present a learning-based approach for generating 3D facial animations with the motion style of a specific subject from arbitrary audio inputs. The subject style is learned from a short video clip (1-2 minutes), either downloaded from the Internet or captured with an ordinary camera. Traditional methods often require many hours of the subject's video to learn a robust audio-driven model and are thus unsuitable for this task. Recent research efforts train a model from video collections of a few subjects but fail to discriminate between the subject-specific style and the underlying speech content within facial motions, leading to inaccurate style or articulation. To address this, we propose a novel framework that disentangles subject-specific style and speech content from facial motions. The disentanglement is enabled by two novel training mechanisms: two-pass style swapping between two random subjects, and joint training of the decomposition network and the audio-to-motion network with a shared decoder. After training, the disentangled style is combined with arbitrary audio inputs to generate stylized audio-driven 3D facial animations. Compared with state-of-the-art methods, our approach achieves better results both qualitatively and quantitatively, especially in difficult cases such as bilabial plosive and bilabial nasal phonemes.
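The two-pass style-swapping mechanism mentioned in the abstract can be illustrated with a toy additive model. This is only a conceptual sketch under an assumed decomposition (motion = per-subject style offset + zero-mean speech content); the function names and the additive form are illustrative assumptions, not the paper's actual networks:

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(motion):
    """Split a (frames, dims) motion clip into a per-subject style code
    (here crudely approximated by the clip mean) and per-frame speech
    content (the residual around that mean)."""
    style = motion.mean(axis=0)
    content = motion - style
    return style, content

def decode(style, content):
    """Recombine a style code with content frames into a motion clip."""
    return content + style

# Two synthetic subjects speaking different content (content is zero-mean
# so the toy encoder can recover the style offset exactly).
style_a, style_b = rng.normal(size=3), rng.normal(size=3)
content_x = rng.normal(size=(5, 3)); content_x -= content_x.mean(axis=0)
content_y = rng.normal(size=(5, 3)); content_y -= content_y.mean(axis=0)
motion_a = decode(style_a, content_x)
motion_b = decode(style_b, content_y)

# Pass 1: decompose both clips and swap their styles.
sa, ca = encode(motion_a)
sb, cb = encode(motion_b)
swapped_a = decode(sb, ca)   # subject B's style on subject A's content
swapped_b = decode(sa, cb)   # subject A's style on subject B's content

# Pass 2: decompose the swapped clips and swap the styles back.
# A well-disentangled model reconstructs the original motions.
sa2, ca2 = encode(swapped_a)
sb2, cb2 = encode(swapped_b)
recon_a = decode(sb2, ca2)
recon_b = decode(sa2, cb2)

print(np.allclose(recon_a, motion_a), np.allclose(recon_b, motion_b))
```

In this idealized setting the round trip is exact; in the paper's learned setting, the second pass instead supplies a training loss that penalizes any content leaking into the style code (or vice versa).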
Pages: 1803-1820 (18 pages)
Related Papers
(50 total)
  • [1] EmoFace: Audio-driven Emotional 3D Face Animation
    Liu, Chang
    Lin, Qunfen
    Zeng, Zijiao
    Pan, Ye
    2024 IEEE CONFERENCE ON VIRTUAL REALITY AND 3D USER INTERFACES, VR 2024, 2024, : 387 - 397
  • [2] Mimic: Speaking Style Disentanglement for Speech-Driven 3D Facial Animation
    Fu, Hui
    Wang, Zeqing
    Gong, Ke
    Wang, Keze
    Chen, Tianshui
    Li, Haojie
    Zeng, Haifeng
    Kang, Wenxiong
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 2, 2024, : 1770 - 1777
  • [3] UniTalker: Scaling up Audio-Driven 3D Facial Animation Through A Unified Model
    Fan, Xiangyu
    Li, Jiaqi
    Lin, Zhiqian
    Xiao, Weiye
    Yang, Lei
    COMPUTER VISION - ECCV 2024, PT XLI, 2025, 15099 : 204 - 221
  • [4] Emotion-Aware Audio-Driven Face Animation via Contrastive Feature Disentanglement
    Ren, Xin
    Luo, Juan
    Zhong, Xionghu
    Cai, Minjie
    INTERSPEECH 2023, 2023, : 2728 - 2732
  • [5] Audio-Driven Facial Animation with Deep Learning: A Survey
    Jiang, Diqiong
    Chang, Jian
    You, Lihua
    Bian, Shaojun
    Kosk, Robert
    Maguire, Greg
    INFORMATION, 2024, 15 (11)
  • [6] Multi-Task Audio-Driven Facial Animation
    Kim, Youngsoo
    An, Shounan
    Jo, Youngbak
    Park, Seungje
    Kang, Shindong
    Oh, Insoo
    Kim, Duke Donghyun
    SIGGRAPH '19 - ACM SIGGRAPH 2019 POSTERS, 2019,
  • [7] A Comparative Study of Four 3D Facial Animation Methods: Skeleton, Blendshape, Audio-Driven, and Vision-Based Capture
    Wei, Mingzhu
    Adamo, Nicoletta
    Giri, Nandhini
    Chen, Yingjie
    ARTSIT, INTERACTIVITY AND GAME CREATION, ARTSIT 2022, 2023, 479 : 36 - 50
  • [8] Imitator: Personalized Speech-driven 3D Facial Animation
    Thambiraja, Balamurugan
    Habibie, Ikhsanul
    Aliakbarian, Sadegh
    Cosker, Darren
    Theobalt, Christian
    Thies, Justus
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 20564 - 20574
  • [9] Audio-Driven Lips and Expression on 3D Human Face
    Ma, Le
    Ma, Zhihao
    Meng, Weiliang
    Xu, Shibiao
    Zhang, Xiaopeng
    ADVANCES IN COMPUTER GRAPHICS, CGI 2023, PT II, 2024, 14496 : 15 - 26
  • [10] Audio-Driven Facial Animation by Joint End-to-End Learning of Pose and Emotion
    Karras, Tero
    Aila, Timo
    Laine, Samuli
    Herva, Antti
    Lehtinen, Jaakko
    ACM TRANSACTIONS ON GRAPHICS, 2017, 36 (04):