Talking Head Generation Based on 3D Morphable Facial Model

Cited by: 1
Authors
Shen, Hsin-Yu [1]
Tsai, Wen-Jiin [1]
Affiliation
[1] Natl Yang Ming Chiao Tung Univ, Dept Comp Sci, Hsinchu, Taiwan
Keywords
talking-head generation; 3DMM; image-to-image translation; self-attention; deep learning
DOI
10.1109/PCS60826.2024.10566437
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a framework for one-shot talking-head video generation that takes a single image of a person and audio clips as input and synthesizes photo-realistic videos with natural head poses and lip motion synchronized to the driving audio. The main idea is to use 3D Morphable Model (3DMM) parameters as the intermediate representation for video generation. We design an Expression Predictor and a Head Pose Predictor to predict facial expression and head-pose parameters from audio, respectively, and adopt a 3DMM to extract identity and texture parameters from the reference image. With these parameters, facial images are rendered as auxiliary guidance for video generation. Compared with the widely used facial landmarks, 3DMM parameters are more powerful at representing facial details. Experimental results show that our method generates realistic talking-head videos and outperforms many state-of-the-art methods.
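The abstract does not give the paper's exact 3DMM formulation. As a rough illustration only, the sketch below assumes the common linear form, in which a face shape is a mean shape plus weighted identity and expression bases. It shows how identity coefficients fixed from one reference image could be combined with per-frame expression coefficients. All names and numbers (`blend_shape`, the toy bases, the coefficient values) are hypothetical, and head pose, texture, and rendering are left out.

```python
from typing import List

Vector = List[float]
Basis = List[Vector]  # one row per coefficient; each row has len(mean) entries


def blend_shape(mean: Vector, id_basis: Basis, id_coeffs: Vector,
                exp_basis: Basis, exp_coeffs: Vector) -> Vector:
    """Assumed linear 3DMM: mean + sum_i a_i * id_basis[i] + sum_j b_j * exp_basis[j]."""
    shape = list(mean)
    for basis, coeffs in ((id_basis, id_coeffs), (exp_basis, exp_coeffs)):
        if len(basis) != len(coeffs):
            raise ValueError("one coefficient per basis vector is required")
        for vec, c in zip(basis, coeffs):
            for k, v in enumerate(vec):
                shape[k] += c * v
    return shape


# Toy data: two 3D vertices flattened to (x0, y0, z0, x1, y1, z1).
mean_shape = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
identity_basis = [[0.0, 1.0, 0.0, 0.0, 1.0, 0.0]]     # hypothetical identity direction
expression_basis = [[0.0, 0.0, 0.0, 0.0, -1.0, 0.0]]  # hypothetical expression direction

identity = [0.5]                 # fixed; stands in for parameters fitted to the reference image
frames = [[0.0], [0.25], [0.5]]  # per-frame; stands in for expression parameters predicted from audio

# One shape per video frame: identity stays constant, expression varies.
video_shapes = [
    blend_shape(mean_shape, identity_basis, identity, expression_basis, e)
    for e in frames
]
```

In this toy setup only the second vertex moves across frames, which mirrors the separation the abstract describes: identity comes from the image, motion comes from the audio.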
Pages: 5