An audio-driven dancing avatar

Cited by: 0

Authors
Ferda Ofli
Yasemin Demir
Yücel Yemez
Engin Erzin
A. Murat Tekalp
Koray Balcı
İdil Kızoğlu
Lale Akarun
Cristian Canton-Ferrer
Joëlle Tilmanne
Elif Bozkurt
A. Tanju Erdem
Affiliations
[1] Koç University, Multimedia, Vision and Graphics Laboratory
[2] Boğaziçi University, Multimedia Group
[3] Technical University of Catalonia, Image and Video Processing Group
[4] Faculty of Engineering of Mons, TCTS Lab
[5] Momentum Digital Media Technologies
Keywords
Multicamera motion capture; Audio-driven body motion synthesis; Dancing avatar animation
DOI: Not available
Abstract
We present a framework for the training and synthesis of an audio-driven dancing avatar. The avatar is trained for a given musical genre using multicamera video recordings of a dance performance. The video is analyzed to capture the time-varying posture of the dancer's body, while the musical audio signal is processed to extract beat information. We consider two marker-based schemes for the motion capture problem: the first represents body motion with 3D joint positions, the second with joint angles. The dancer's body movements are characterized by a set of recurring semantic motion patterns, i.e., dance figures. Each dance figure is modeled in a supervised manner with a set of hidden Markov model (HMM) structures and the associated beat frequency. In the synthesis phase, an audio signal of unknown musical type is first classified, within a time interval, into one of the genres learned in the analysis phase, based on mel-frequency cepstral coefficients (MFCCs). The motion parameters of the corresponding dance figures are then synthesized via the trained HMM structures in synchrony with the audio signal, using the estimated tempo information. Finally, the generated motion parameters, either the joint angles or the 3D joint positions of the body, are animated along with the musical audio using two animation tools that we have developed. Experimental results demonstrate the effectiveness of the proposed framework.
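The audio side of the pipeline lends itself to a short illustration. The sketch below (Python, using librosa and hmmlearn as stand-in libraries, which the paper does not prescribe) shows one plausible reading of the classification and synchronization steps: per-genre Gaussian HMMs fitted on MFCC sequences, an unknown clip assigned to the genre whose model gives the highest log-likelihood, and tempo estimated for beat-synchronous synthesis. Function names and parameters here are illustrative assumptions, not the authors' implementation.

# Illustrative sketch only: the paper does not specify these libraries or models.
# Assumed: librosa for MFCC/beat extraction, hmmlearn for Gaussian HMMs.
import numpy as np
import librosa
from hmmlearn import hmm

def extract_mfcc(path, sr=16000, n_mfcc=13):
    # Load audio and return an (n_frames, n_mfcc) MFCC feature matrix.
    y, _ = librosa.load(path, sr=sr)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).T

def train_genre_models(clips_by_genre, n_states=5):
    # Fit one Gaussian HMM per genre on the concatenated MFCC sequences
    # of that genre's training clips (hypothetical analysis-phase step).
    models = {}
    for genre, paths in clips_by_genre.items():
        feats = [extract_mfcc(p) for p in paths]
        X = np.vstack(feats)                   # stacked frames
        lengths = [f.shape[0] for f in feats]  # per-clip frame counts
        model = hmm.GaussianHMM(n_components=n_states,
                                covariance_type="diag", n_iter=50)
        model.fit(X, lengths)
        models[genre] = model
    return models

def classify_and_estimate_tempo(path, models):
    # Assign the clip to the genre whose HMM yields the highest
    # log-likelihood, and estimate tempo for beat-synchronous synthesis.
    feats = extract_mfcc(path)
    genre = max(models, key=lambda g: models[g].score(feats))
    y, sr = librosa.load(path, sr=16000)
    tempo, _ = librosa.beat.beat_track(y=y, sr=sr)
    return genre, float(np.atleast_1d(tempo)[0])

Given such a classifier, the synthesis phase described in the abstract would then sample motion parameters from the dance-figure HMMs of the selected genre and align them to the estimated beat period before passing them to the animation tools.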
Pages: 93-103
Number of pages: 10