Text2Video: Text-driven facial animation using MPEG-4

Cited by: 0

Authors:

Rurainsky, J [1]

Eisert, P [1]

Affiliations:

[1] Heinrich Hertz Inst Nachrichtentech Berlin GmbH, Fraunhofer Inst Telecommun, Image Proc Dept, D-10587 Berlin, Germany

Source:

VISUAL COMMUNICATIONS AND IMAGE PROCESSING 2005, PTS 1-4 | 2005, Vol. 5960

Keywords:

MPEG-4; facial animation; text-driven animation; SMS; MMS

DOI:

10.1117/12.631413

Chinese Library Classification (CLC) number:

TB8 [Photographic technology]

Discipline classification code:

0804

Abstract:

We present a complete system for the automatic creation of talking head video sequences from text messages. Our system converts the text into MPEG-4 Facial Animation Parameters and synthetic voice. A user-selected 3D character performs lip movements synchronized to the speech data. The 3D models, each created from a single image, range from realistic people to cartoon characters. A voice selection for different languages and genders, together with a pitch shift component, enables personalization of the animation. The animation can be shown on different displays and devices, ranging from 3GPP players on mobile phones to real-time 3D render engines. Our system can therefore be used in mobile communication to convert regular SMS messages into MMS animations.
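
The step from speech timing to lip movement could be sketched roughly as follows. This is an illustration only, not the authors' implementation: it assumes a text-to-speech front end that returns phonemes with start and end times in milliseconds (the `PhonemeSegment` type and the `visemes_per_frame` function are hypothetical names), and it uses a partial phoneme-to-viseme table whose indices are meant to follow the MPEG-4 viseme grouping and should be checked against the standard before use. Phonemes not in the table fall back to viseme 0 (neutral).

```python
from dataclasses import dataclass

# Partial, illustrative phoneme-to-viseme table (assumption: indices follow
# the MPEG-4 viseme grouping; verify against ISO/IEC 14496-2 before use).
PHONEME_TO_VISEME = {
    "p": 1, "b": 1, "m": 1,
    "f": 2, "v": 2,
    "t": 4, "d": 4,
    "k": 5, "g": 5,
    "s": 7, "z": 7,
    "n": 8, "l": 8,
    "r": 9,
}


@dataclass
class PhonemeSegment:
    """One phoneme with timing, as a hypothetical TTS front end might report it."""
    phoneme: str
    start_ms: int
    end_ms: int


def visemes_per_frame(segments, fps=25):
    """Return one viseme index per video frame for contiguous phoneme segments."""
    if not segments:
        return []
    total_ms = segments[-1].end_ms
    n_frames = (total_ms * fps + 999) // 1000  # round up to whole frames
    frames = []
    seg_idx = 0
    for i in range(n_frames):
        t_ms = i * 1000 / fps
        # Advance to the segment that covers the current frame time.
        while seg_idx < len(segments) - 1 and t_ms >= segments[seg_idx].end_ms:
            seg_idx += 1
        frames.append(PHONEME_TO_VISEME.get(segments[seg_idx].phoneme, 0))
    return frames


if __name__ == "__main__":
    demo = [PhonemeSegment("m", 0, 80), PhonemeSegment("a", 80, 200)]
    print(visemes_per_frame(demo))
```

A real system would additionally blend neighbouring visemes and derive the remaining facial animation parameters; the sketch only shows how phoneme timing can be sampled at the video frame rate so that lips and audio stay aligned.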

Pages: 492-500

Page count: 9
