Text2Video: Text-driven facial animation using MPEG-4

Cited by: 0
Authors
Rurainsky, J [1]
Eisert, P [1]
Affiliations
[1] Heinrich Hertz Inst Nachrichtentech Berlin GmbH, Fraunhofer Inst Telecommun, Image Proc Dept, D-10587 Berlin, Germany
Keywords
MPEG-4; facial animation; text-driven animation; SMS; MMS
DOI
10.1117/12.631413
Chinese Library Classification (CLC) number
TB8 [Photographic technology]
Discipline classification code
0804
Abstract
We present a complete system for the automatic creation of talking-head video sequences from text messages. Our system converts the text into MPEG-4 Facial Animation Parameters and synthetic voice. A user-selected 3D character performs lip movements synchronized to the speech data. The 3D models, each created from a single image, range from realistic people to cartoon characters. Voice selection for different languages and genders, together with a pitch-shift component, enables personalization of the animation. The animation can be shown on different displays and devices, ranging from 3GPP players on mobile phones to real-time 3D render engines. Our system can therefore be used in mobile communication to convert regular SMS messages into MMS animations.
Pages: 492-500
Page count: 9