Audio-Driven Facial Animation with Deep Learning: A Survey

Authors
Jiang, Diqiong [1]
Chang, Jian [1]
You, Lihua [1]
Bian, Shaojun [2]
Kosk, Robert [1]
Maguire, Greg [3]
Affiliations
[1] Bournemouth Univ, Natl Ctr Comp Animat, Poole BH12 5BB, England
[2] Buckinghamshire New Univ, Sch Creat & Digital Ind, High Wycombe HP11 2JZ, England
[3] Ulster Univ, Belfast Sch Art, Belfast BT15 1ED, Northern Ireland
Funding
European Union Horizon 2020
Keywords
deep learning; audio processing; talking head; face generation; AUDIOVISUAL CORPUS; SPEECH;
DOI
10.3390/info15110675
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Audio-driven facial animation is a rapidly evolving field that aims to generate realistic facial expressions and lip movements synchronized with a given audio input. This survey provides a comprehensive review of deep learning techniques applied to audio-driven facial animation, with a focus on both audio-driven facial image animation and audio-driven facial mesh animation. These approaches employ deep learning to map audio inputs directly onto 3D facial meshes or 2D images, enabling the creation of highly realistic and synchronized animations. This survey also explores evaluation metrics, available datasets, and the challenges that remain, such as disentangling lip synchronization and emotions, generalization across speakers, and dataset limitations. Lastly, we discuss future directions, including multi-modal integration, personalized models, and facial attribute modification in animations, all of which are critical for the continued development and application of this technology.
Pages: 24
Related papers
50 in total
  • [21] Photorealistic Audio-driven Video Portraits
    Wen, Xin
    Wang, Miao
    Richardt, Christian
    Chen, Ze-Yin
    Hu, Shi-Min
    IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, 2020, 26 (12) : 3457 - 3466
  • [22] Audio-Driven Laughter Behavior Controller
    Ding, Yu
    Huang, Jing
    Pelachaud, Catherine
    IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2017, 8 (04) : 546 - 558
  • [23] Audio-Driven Emotional Video Portraits
    Ji, Xinya
    Zhou, Hang
    Wang, Kaisiyuan
    Wu, Wayne
    Loy, Chen Change
    Cao, Xun
    Xu, Feng
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021 : 14075 - 14084
  • [24] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
    Zhang, Wenxuan
    Cun, Xiaodong
    Wang, Xuan
    Zhang, Yong
    Shen, Xi
    Guo, Yu
    Shan, Ying
    Wang, Fei
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023 : 8652 - 8661
  • [25] Voice2Face: Audio-driven Facial and Tongue Rig Animations with cVAEs
    Aylagas, Monica Villanueva
    Leon, Hector Anadon
    Teye, Mattias
    Tollmar, Konrad
    COMPUTER GRAPHICS FORUM, 2022, 41 (08) : 255 - 265
  • [26] Video-Audio Driven Real-Time Facial Animation
    Liu, Yilong
    Xu, Feng
    Chai, Jinxiang
    Tong, Xin
    Wang, Lijuan
    Huo, Qiang
    ACM TRANSACTIONS ON GRAPHICS, 2015, 34 (06):
  • [27] Audio- and Gaze-driven Facial Animation of Codec Avatars
    Richard, Alexander
    Lea, Colin
    Ma, Shugao
    Gall, Juergen
    de la Torre, Fernando
    Sheikh, Yaser
    2021 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2021), 2021 : 41 - 50
  • [28] A HYBRID PHONEME BASED CLUSTERING APPROACH FOR AUDIO DRIVEN FACIAL ANIMATION
    Havell, Benjamin
    Rosin, Paul L.
    Sanei, Saeid
    Aubrey, Andrew
    Marshall, David
    Hicks, Yulia
    2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012 : 2261 - 2264
  • [29] Audio-Driven Multimedia Content Authentication as a Service
    Vryzas, Nikolaos
    Katsaounidou, Anastasia
    Kotsakis, Rigas
    Dimoulas, Charalampos
    Kalliris, George
    146TH AES CONVENTION, 2019
  • [30] Audio-Driven Talking Face Generation: A Review
    Liu, Shiguang
    JOURNAL OF THE AUDIO ENGINEERING SOCIETY, 2023, 71 (7-8) : 408 - 419