A comprehensive system for facial animation of generic 3D head models driven by speech

Cited by: 0

Authors:

Lucas D Terissi

Mauricio Cerda

Juan C Gómez

Nancy Hitschfeld-Kahler

Bernard Girau

Affiliations:

[1] Universidad Nacional de Rosario and CIFASIS, Laboratory for System Dynamics & Signal Processing

[2] Universidad de Chile, SCIAN

[3] Universidad de Chile, Lab, Faculty of Medicine

[4] Loria - INRIA Nancy Grand Est, Computer Science Department, FCFyM

[5] Cortex Team

Source:

EURASIP Journal on Audio, Speech, and Music Processing, vol. 2013

Keywords:

Facial animation; Hidden Markov models; Audio-visual speech processing

DOI:

Not available

Abstract:

A comprehensive system for facial animation of generic 3D head models driven by speech is presented in this article. In the training stage, audio-visual information is extracted from audio-visual training data and then used to compute the parameters of a single joint audio-visual hidden Markov model (AV-HMM). In contrast to most methods in the literature, the proposed approach does not require segmentation/classification stages for the audio-visual data, which avoids the error propagation associated with these procedures. The trained AV-HMM provides a compact representation of the audio-visual data without the need for phoneme (word) segmentation, which makes it adaptable to different languages. Visual features are estimated from the speech signal by inverting the AV-HMM. The estimated visual speech features are used to animate a simple face model. The animation of a more complex head model is then obtained by automatically mapping the deformation of the simple model onto it, using a small number of control points for the interpolation. The proposed algorithm allows the animation of 3D head models of arbitrary complexity through a simple setup procedure. The resulting animation is evaluated in terms of the intelligibility of visual speech through perceptual tests, showing promising performance. The computational complexity of the proposed system is analyzed, showing that a real-time implementation is feasible.
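
The step in which the deformation of the simple model is mapped to a more complex head model through a small number of control points can be illustrated with a short sketch. The abstract does not say which interpolation scheme is used, so the Gaussian radial basis function (RBF) interpolation below is an assumption made for illustration only, not the authors' method. The function names `rbf_weights` and `deform`, the parameter `sigma`, and the toy coordinates are all hypothetical.

```python
import numpy as np


def rbf_weights(control_pts, control_disp, sigma=1.0):
    """Solve for RBF weights that reproduce the control-point displacements.

    Assumption: a Gaussian kernel; the source does not name the kernel.
    """
    # Pairwise squared distances between control points, shape (k, k).
    d2 = np.sum((control_pts[:, None, :] - control_pts[None, :, :]) ** 2, axis=-1)
    kernel = np.exp(-d2 / (2.0 * sigma ** 2))
    # One weight vector per control point, shape (k, 3).
    return np.linalg.solve(kernel, control_disp)


def deform(vertices, control_pts, weights, sigma=1.0):
    """Move every mesh vertex by the interpolated displacement field."""
    # Squared distances from each vertex to each control point, shape (n, k).
    d2 = np.sum((vertices[:, None, :] - control_pts[None, :, :]) ** 2, axis=-1)
    return vertices + np.exp(-d2 / (2.0 * sigma ** 2)) @ weights


# Toy data (hypothetical): four control points and their displacements,
# standing in for the motion estimated on the simple face model.
control_pts = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)
control_disp = np.array(
    [[0, 0, 0.1], [0, 0, 0], [0, 0.2, 0], [0, 0, 0]], dtype=float
)

weights = rbf_weights(control_pts, control_disp)

# Sanity check: vertices located at the control points follow them exactly.
moved = deform(control_pts, control_pts, weights)
```

In this sketch the dense mesh would be passed as `vertices`; each vertex follows nearby control points more strongly than distant ones, and `sigma` sets how far the influence of a control point reaches.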

