CSTalk: Correlation Supervised Speech-driven 3D Emotional Facial Animation Generation

Cited by: 0

Authors:

Liang, Xiangyu [1]

Zhuang, Wenlin [1]

Wang, Tianyong [1]

Geng, Guangxing [2]

Geng, Guangyue [2]

Xia, Haifeng [1]

Xia, Siyu [1]

Affiliations:

[1] Southeast Univ, Sch Automat, Nanjing, Peoples R China

[2] Nanjing 8 8 Digital Technol Co Ltd, Nanjing, Peoples R China

Source:

2024 IEEE 18TH INTERNATIONAL CONFERENCE ON AUTOMATIC FACE AND GESTURE RECOGNITION, FG 2024 | 2024

Keywords:

DOI:

10.1109/FG59268.2024.10581920

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Speech-driven 3D facial animation has been studied for years, but its practical application still falls short of expectations. The main challenges lie in data limitations, lip alignment, and the naturalness of facial expressions. Although lip alignment has been studied extensively, existing methods struggle to synthesize natural and realistic expressions, so the resulting facial animations look mechanical and stiff. Even when emotional features are extracted from speech, the randomness of facial movements limits the effective expression of emotions. To address this issue, this paper proposes CSTalk (Correlation Supervised), a method that models the correlations among different regions of facial movement and uses them to supervise the training of the generative model, so that it generates realistic expressions that conform to human facial motion patterns. To generate more intricate animations, we employ a rich set of control parameters based on the MetaHuman character model and capture a dataset covering five different emotions. We train a generative network with an autoencoder structure and feed it an emotion embedding vector to generate user-controllable expressions. Experimental results demonstrate that our method outperforms existing state-of-the-art methods.

Pages: 5