Audio-Driven Facial Animation with Deep Learning: A Survey

Cited: 0
|
Authors
Jiang, Diqiong [1 ]
Chang, Jian [1 ]
You, Lihua [1 ]
Bian, Shaojun [2 ]
Kosk, Robert [1 ]
Maguire, Greg [3 ]
Affiliations
[1] Bournemouth Univ, Natl Ctr Comp Animat, Poole BH12 5BB, England
[2] Buckinghamshire New Univ, Sch Creat & Digital Ind, High Wycombe HP11 2JZ, England
[3] Ulster Univ, Belfast Sch Art, Belfast BT15 1ED, North Ireland
Funding
European Union Horizon 2020;
Keywords
deep learning; audio processing; talking head; face generation; AUDIOVISUAL CORPUS; SPEECH;
DOI
10.3390/info15110675
CLC Number
TP [automation technology, computer technology];
Subject Classification Code
0812;
Abstract
Audio-driven facial animation is a rapidly evolving field that aims to generate realistic facial expressions and lip movements synchronized with a given audio input. This survey provides a comprehensive review of deep learning techniques applied to audio-driven facial animation, with a focus on both audio-driven facial image animation and audio-driven facial mesh animation. These approaches employ deep learning to map audio inputs directly onto 3D facial meshes or 2D images, enabling the creation of highly realistic and synchronized animations. This survey also explores evaluation metrics, available datasets, and the challenges that remain, such as disentangling lip synchronization and emotions, generalization across speakers, and dataset limitations. Lastly, we discuss future directions, including multi-modal integration, personalized models, and facial attribute modification in animations, all of which are critical for the continued development and application of this technology.
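To make the core mapping described above concrete, the following is a minimal, hypothetical PyTorch sketch of an audio-to-mesh regressor: a small temporal-convolution encoder over mel-spectrogram windows predicts per-frame 3D vertex displacements that are added to a neutral template mesh. All names, layer sizes, and the 80-band mel / 5023-vertex choices are illustrative assumptions for exposition, not the architecture of any specific method reviewed in the survey.

```python
# Illustrative sketch only (not from the survey): regress per-frame 3D facial mesh
# vertex displacements from a short window of mel-spectrogram frames.
import torch
import torch.nn as nn

class AudioToMeshModel(nn.Module):
    def __init__(self, n_mels=80, n_vertices=5023):
        super().__init__()
        # Temporal convolutions over the audio window around one video frame.
        self.audio_encoder = nn.Sequential(
            nn.Conv1d(n_mels, 128, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv1d(128, 256, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),  # pool over time -> one feature vector per frame
        )
        # Decoder maps the audio feature to per-vertex 3D displacements.
        self.mesh_decoder = nn.Sequential(
            nn.Linear(256, 512),
            nn.ReLU(),
            nn.Linear(512, n_vertices * 3),
        )
        self.n_vertices = n_vertices

    def forward(self, mel_window):
        # mel_window: (batch, n_mels, context_frames) audio features for one video frame
        feat = self.audio_encoder(mel_window).squeeze(-1)   # (batch, 256)
        offsets = self.mesh_decoder(feat)                   # (batch, n_vertices * 3)
        return offsets.view(-1, self.n_vertices, 3)         # displacements from a template mesh

# Usage sketch: animate by adding predicted offsets to a neutral template mesh;
# such models are typically trained with a vertex-wise L2 loss against 4D scan data.
model = AudioToMeshModel()
mel = torch.randn(2, 80, 16)        # dummy batch of mel-spectrogram windows
template = torch.zeros(5023, 3)     # placeholder neutral template vertices
animated = template + model(mel)    # (2, 5023, 3) animated vertex positions
```

Image-based (2D) methods follow the same pattern at a high level, except the decoder generates or warps face images rather than mesh vertices, often conditioned on a reference identity frame.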
Pages: 24