An audio-driven dancing avatar

Cited by: 0
Authors
Ferda Ofli
Yasemin Demir
Yücel Yemez
Engin Erzin
A. Murat Tekalp
Koray Balcı
İdil Kızoğlu
Lale Akarun
Cristian Canton-Ferrer
Joëlle Tilmanne
Elif Bozkurt
A. Tanju Erdem
Affiliations
[1] Koç University, Multimedia, Vision and Graphics Laboratory
[2] Boğaziçi University, Multimedia Group
[3] Technical University of Catalonia, Image and Video Processing Group
[4] Faculty of Engineering of Mons, TCTS Lab
[5] Momentum Digital Media Technologies
Keywords
Multicamera motion capture; Audio-driven body motion synthesis; Dancing avatar animation
DOI: not available
Abstract
We present a framework for the training and synthesis of an audio-driven dancing avatar. The avatar is trained for a given musical genre using multicamera video recordings of a dance performance. The video is analyzed to capture the time-varying posture of the dancer's body, whereas the musical audio signal is processed to extract beat information. We consider two different marker-based schemes for the motion capture problem: the first represents body motion with 3D joint positions, whereas the second uses joint angles. The body movements of the dancer are characterized by a set of recurring semantic motion patterns, i.e., dance figures. Each dance figure is modeled in a supervised manner with a set of hidden Markov model (HMM) structures and an associated beat frequency. In the synthesis phase, an audio signal of unknown musical genre is first classified, over a time interval, into one of the genres learned in the analysis phase, based on mel-frequency cepstral coefficients (MFCCs). The motion parameters of the corresponding dance figures are then synthesized via the trained HMM structures in synchrony with the audio signal, based on the estimated tempo information. Finally, the generated motion parameters, either the joint angles or the 3D joint positions of the body, are animated along with the musical audio using two different animation tools that we have developed. Experimental results demonstrate the effectiveness of the proposed framework.
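The pipeline the abstract describes lends itself to a compact sketch. The Python fragment below illustrates, under loose assumptions, how the analysis and synthesis phases could be wired together with off-the-shelf tools: MFCC extraction and beat tracking via librosa, and per-genre and per-figure hidden Markov models via hmmlearn. All function names, parameter values, and library choices here are illustrative assumptions, not the authors' implementation, which this record does not include.

# Illustrative sketch only: librosa and hmmlearn stand in for the paper's
# (unspecified) feature-extraction and HMM machinery.
import numpy as np
import librosa
from hmmlearn import hmm

def extract_mfcc(path, n_mfcc=13):
    # Load audio and compute MFCCs; returns a (frames x coefficients)
    # matrix plus the raw signal for later beat tracking.
    y, sr = librosa.load(path, sr=None)
    feats = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).T
    return feats, y, sr

def train_genre_models(genre_clips, n_states=5):
    # Analysis phase: fit one Gaussian HMM per genre over concatenated
    # MFCC sequences (genre_clips: genre name -> list of MFCC matrices).
    models = {}
    for genre, clips in genre_clips.items():
        X = np.vstack(clips)
        lengths = [len(c) for c in clips]
        model = hmm.GaussianHMM(n_components=n_states,
                                covariance_type="diag", n_iter=50)
        model.fit(X, lengths)
        models[genre] = model
    return models

def classify_genre(feats, models):
    # Synthesis phase, step 1: pick the genre whose HMM assigns the
    # highest log-likelihood to the observed MFCC frames.
    return max(models, key=lambda g: models[g].score(feats))

def estimate_tempo(y, sr):
    # Synthesis phase, step 2: beat tracking, which provides the tempo
    # estimate used to keep motion in synchrony with the audio.
    tempo, beat_frames = librosa.beat.beat_track(y=y, sr=sr)
    return tempo, beat_frames

def synthesize_motion(figure_hmm, n_frames):
    # Synthesis phase, step 3: sample a motion-parameter trajectory
    # (joint angles or 3D joint positions) from a trained figure HMM.
    trajectory, _states = figure_hmm.sample(n_frames)
    return trajectory

A full system would also need the stages the sketch omits, notably the rule for mapping a classified genre and the tracked beats onto a sequence of dance-figure HMMs, and the final animation step; both depend on details (figure transition scheme, skeleton format) that the abstract does not specify.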
Pages: 93–103 (10 pages)