A comprehensive system for facial animation of generic 3D head models driven by speech

Authors
Lucas D Terissi
Mauricio Cerda
Juan C Gómez
Nancy Hitschfeld-Kahler
Bernard Girau
Affiliations
[1] Universidad Nacional de Rosario and CIFASIS, Laboratory for System Dynamics & Signal Processing
[2] Universidad de Chile, SCIAN
[3] Universidad de Chile, Lab, Faculty of Medicine
[4] Loria - INRIA Nancy Grand Est, Computer Science Department, FCFyM
[5] Cortex Team
Keywords
Facial animation; Hidden Markov models; Audio-visual speech processing
Abstract
This article presents a comprehensive system for speech-driven facial animation of generic 3D head models. In the training stage, audio-visual information is extracted from audio-visual training data and used to compute the parameters of a single joint audio-visual hidden Markov model (AV-HMM). In contrast to most methods in the literature, the proposed approach does not require segmentation/classification stages for the audio-visual data, which avoids the error propagation associated with these procedures. The trained AV-HMM provides a compact representation of the audio-visual data without the need for phoneme (word) segmentation, which makes it adaptable to different languages. Visual features are estimated from the speech signal by inverting the AV-HMM. The estimated visual speech features are used to animate a simple face model. The animation of a more complex head model is then obtained by automatically mapping the deformation of the simple model onto it, using a small number of control points for the interpolation. The proposed algorithm allows 3D head models of arbitrary complexity to be animated through a simple setup procedure. The resulting animation is evaluated in terms of visual speech intelligibility through perceptual tests, showing promising performance. The computational complexity of the proposed system is analyzed, showing that a real-time implementation is feasible.
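
The abstract does not say which interpolation scheme carries the simple model's deformation over to the complex head model. As a rough illustration only, the sketch below assumes inverse-distance weighting: each vertex of the complex mesh moves by a weighted average of the control-point displacements, with nearer control points weighing more. The function name `interpolate_displacements` and its parameters are hypothetical and are not taken from the article.

```python
import math


def interpolate_displacements(vertices, control_points, control_displacements,
                              power=2.0, eps=1e-9):
    """Spread control-point displacements to mesh vertices.

    Assumption: inverse-distance weighting, chosen only for illustration;
    the article's actual interpolation scheme may differ.
    All points and displacements are 3-tuples of floats.
    """
    moved = []
    for v in vertices:
        weights = []
        exact = None
        for i, c in enumerate(control_points):
            d = math.dist(v, c)
            if d < eps:
                # The vertex coincides with a control point: follow it exactly.
                exact = i
                break
            weights.append(1.0 / d ** power)
        if exact is not None:
            disp = control_displacements[exact]
        else:
            total = sum(weights)
            disp = tuple(
                sum(w * dv[k] for w, dv in zip(weights, control_displacements)) / total
                for k in range(3)
            )
        moved.append(tuple(v[k] + disp[k] for k in range(3)))
    return moved


# Two control points; only the first one moves (one unit along y).
controls = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
displacements = [(0.0, 1.0, 0.0), (0.0, 0.0, 0.0)]

# A vertex halfway between the two control points receives half the motion.
print(interpolate_displacements([(1.0, 0.0, 0.0)], controls, displacements))
# -> [(1.0, 0.5, 0.0)]
```

With only a few control points, the per-frame cost grows with the number of vertices times the number of control points, which fits the abstract's emphasis on a small control set.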
Related papers
  • [1] A comprehensive system for facial animation of generic 3D head models driven by speech
    Terissi, Lucas D.
    Cerda, Mauricio
    Gomez, Juan C.
    Hitschfeld-Kahler, Nancy
    Girau, Bernard
    [J]. EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2013
  • [2] ANIMATION OF GENERIC 3D HEAD MODELS DRIVEN BY SPEECH
    Terissi, Lucas
    Cerda, Mauricio
    Gomez, Juan C.
    Hitschfeld-Kahler, Nancy
    Girau, Bernard
    Valenzuela, Renato
    [J]. 2011 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2011
  • [3] DiffPoseTalk: Speech-Driven Stylistic 3D Facial Animation and Head Pose Generation via Diffusion Models
    Sun, Zhiyao
    Lv, Tian
    Ye, Sheng
    Lin, Matthieu
    Sheng, Jenny
    Wen, Yu-Hui
    Yu, Minjing
    Liu, Yong-Jin
    [J]. ACM TRANSACTIONS ON GRAPHICS, 2024, 43 (04)
  • [4] FaceFormer: Speech-Driven 3D Facial Animation with Transformers
    Fan, Yingruo
    Lin, Zhaojiang
    Saito, Jun
    Wang, Wenping
    Komura, Taku
    [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022: 18749-18758
  • [5] Speech-driven 3D Facial Animation for Mobile Entertainment
    Yan, Juan
    Xie, Xiang
    Hu, Hao
    [J]. INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 2008: 2334-2337
  • [6] Imitator: Personalized Speech-driven 3D Facial Animation
    Thambiraja, Balamurugan
    Habibie, Ikhsanul
    Aliakbarian, Sadegh
    Cosker, Darren
    Theobalt, Christian
    Thies, Justus
    [J]. 2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023: 20564-20574
  • [7] Speech-Driven 3D Facial Animation with Mesh Convolution
    Ji, Xuejie
    Su, Zewei
    Dong, Lanfang
    Li, Guoming
    [J]. 2022 INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, COMPUTER VISION AND MACHINE LEARNING (ICICML), 2022: 14-18
  • [8] CLTalk: Speech-Driven 3D Facial Animation with Contrastive Learning
    Zhang, Xitie
    Wu, Suping
    [J]. PROCEEDINGS OF THE 4TH ANNUAL ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2024, 2024: 1175-1179
  • [9] Speech4Mesh: Speech-Assisted Monocular 3D Facial Reconstruction for Speech-Driven 3D Facial Animation
    He, Shan
    He, Haonan
    Yang, Shuo
    Wu, Xiaoyan
    Xia, Pengcheng
    Yin, Bing
    Liu, Cong
    Dai, Lirong
    Xu, Chang
    [J]. 2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023: 14146-14156
  • [10] Mimic: Speaking Style Disentanglement for Speech-Driven 3D Facial Animation
    Fu, Hui
    Wang, Zeqing
    Gong, Ke
    Wang, Keze
    Chen, Tianshui
    Li, Haojie
    Zeng, Haifeng
    Kang, Wenxiong
    [J]. THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 2, 2024: 1770-1777