Text2Video: Text-driven facial animation using MPEG-4

Cited by: 0

Authors:

Rurainsky, J [1]

Eisert, P [1]

Affiliation:

[1] Heinrich Hertz Inst Nachrichtentech Berlin GmbH, Fraunhofer Inst Telecommun, Image Proc Dept, D-10587 Berlin, Germany

Source:

VISUAL COMMUNICATIONS AND IMAGE PROCESSING 2005, PTS 1-4 | 2005 / Vol. 5960

Keywords:

MPEG-4; facial animation; text-driven animation; SMS; MMS

DOI:

10.1117/12.631413

Chinese Library Classification:

TB8 [Photographic technology]

Discipline classification code:

0804

Abstract:

We present a complete system for the automatic creation of talking-head video sequences from text messages. Our system converts the text into MPEG-4 Facial Animation Parameters and synthetic voice. A user-selected 3D character performs lip movements synchronized to the speech data. The 3D models, each created from a single image, range from realistic people to cartoon characters. Voice selection for different languages and genders, together with a pitch-shift component, enables personalization of the animation. The animation can be shown on different displays and devices, ranging from 3GPP players on mobile phones to real-time 3D render engines. Our system can therefore be used in mobile communication to convert regular SMS messages into MMS animations.
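The text-to-animation step summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a speech synthesizer has already produced phonemes with durations in milliseconds, and the phoneme labels, the `PHONEME_TO_VISEME` table, and the function name `visemes_per_frame` are hypothetical. The viseme indices only loosely follow the MPEG-4 viseme groups and should be checked against the standard before use.

```python
# Illustrative phoneme-to-viseme table (assumption: simplified phoneme labels;
# indices loosely follow the MPEG-4 viseme groups, 0 = neutral mouth).
PHONEME_TO_VISEME = {
    "p": 1, "b": 1, "m": 1,
    "f": 2, "v": 2,
    "t": 4, "d": 4,
    "k": 5, "g": 5,
    "s": 7, "z": 7,
    "n": 8, "l": 8,
    "r": 9,
    "a": 10, "e": 11, "i": 12, "o": 13, "u": 14,
}


def visemes_per_frame(timed_phonemes, fps=25):
    """Return one viseme index per video frame.

    timed_phonemes: list of (phoneme, duration_ms) pairs, e.g. from a
    text-to-speech front end (hypothetical input format).
    A frame i belongs to the phoneme that is active at time i / fps.
    Integer arithmetic avoids floating-point drift in the timing.
    """
    frames = []
    end_ms = 0
    for phoneme, duration_ms in timed_phonemes:
        end_ms += duration_ms
        viseme = PHONEME_TO_VISEME.get(phoneme, 0)  # unknown -> neutral
        # Frame i starts at i * 1000 / fps ms; compare without dividing.
        while len(frames) * 1000 < end_ms * fps:
            frames.append(viseme)
    return frames


if __name__ == "__main__":
    # "ma": 80 ms of "m" followed by 120 ms of "a" at 25 frames per second.
    print(visemes_per_frame([("m", 80), ("a", 120)]))
```

A real system would additionally blend neighbouring visemes for coarticulation and encode the result as a Facial Animation Parameter stream; the sketch only shows how speech timing can drive per-frame mouth shapes.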

Citation

Pages: 492-500

Page count: 9
