An audio-driven dancing avatar

Cited by: 0
Authors
Ferda Ofli
Yasemin Demir
Yücel Yemez
Engin Erzin
A. Murat Tekalp
Koray Balcı
İdil Kızoğlu
Lale Akarun
Cristian Canton-Ferrer
Joëlle Tilmanne
Elif Bozkurt
A. Tanju Erdem
Affiliations
[1] Koç University, Multimedia, Vision and Graphics Laboratory
[2] Boğaziçi University, Multimedia Group
[3] Technical University of Catalonia, Image and Video Processing Group
[4] Faculty of Engineering of Mons, TCTS Lab
[5] Momentum Digital Media Technologies
Keywords
Multicamera motion capture; Audio-driven body motion synthesis; Dancing avatar animation
DOI: not available
Abstract
We present a framework for training and synthesis of an audio-driven dancing avatar. The avatar is trained for a given musical genre using the multicamera video recordings of a dance performance. The video is analyzed to capture the time-varying posture of the dancer’s body whereas the musical audio signal is processed to extract the beat information. We consider two different marker-based schemes for the motion capture problem. The first scheme uses 3D joint positions to represent the body motion whereas the second uses joint angles. Body movements of the dancer are characterized by a set of recurring semantic motion patterns, i.e., dance figures. Each dance figure is modeled in a supervised manner with a set of HMM (Hidden Markov Model) structures and the associated beat frequency. In the synthesis phase, an audio signal of unknown musical type is first classified, within a time interval, into one of the genres that have been learnt in the analysis phase, based on mel frequency cepstral coefficients (MFCC). The motion parameters of the corresponding dance figures are then synthesized via the trained HMM structures in synchrony with the audio signal based on the estimated tempo information. Finally, the generated motion parameters, either the joint angles or the 3D joint positions of the body, are animated along with the musical audio using two different animation tools that we have developed. Experimental results demonstrate the effectiveness of the proposed framework.
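The abstract describes a concrete pipeline: MFCCs and tempo are extracted from the audio, the musical genre is identified from the MFCCs, and motion parameters are then sampled from per-figure HMMs in synchrony with the beat. Below is a minimal sketch of those steps, assuming librosa for the audio features and hmmlearn for the HMM machinery; the function names, the highest-likelihood genre classifier, and parameters such as n_states and beats_per_figure are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the analysis/synthesis steps from the abstract.
# Assumes librosa (MFCCs, beat tracking) and hmmlearn (Gaussian HMMs);
# all names and parameter values here are illustrative, not the paper's code.
import numpy as np
import librosa
from hmmlearn import hmm

def extract_audio_features(wav_path, sr=22050, n_mfcc=13):
    """MFCCs for genre classification and tempo (BPM) for beat synchrony."""
    y, _ = librosa.load(wav_path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).T  # (frames, n_mfcc)
    tempo, _ = librosa.beat.beat_track(y=y, sr=sr)
    return mfcc, float(np.atleast_1d(tempo)[0])

def train_figure_hmm(motion_clips, n_states=8):
    """Fit one Gaussian HMM per recurring dance figure.

    motion_clips: list of (frames, dims) arrays holding either joint
    angles or 3D joint positions captured for that figure.
    """
    X = np.vstack(motion_clips)
    lengths = [len(clip) for clip in motion_clips]
    model = hmm.GaussianHMM(n_components=n_states,
                            covariance_type="diag", n_iter=50)
    model.fit(X, lengths)
    return model

def classify_genre(mfcc, genre_models):
    """Pick the trained genre whose model scores the MFCC stream highest
    (a stand-in for the paper's MFCC-based classifier)."""
    return max(genre_models, key=lambda g: genre_models[g].score(mfcc))

def synthesize_figure(figure_model, motion_fps, tempo_bpm, beats_per_figure=4):
    """Sample motion parameters spanning one dance figure, with the frame
    count stretched to match the estimated tempo."""
    n_frames = int(round(motion_fps * 60.0 / tempo_bpm * beats_per_figure))
    trajectory, _ = figure_model.sample(n_frames)
    return trajectory  # feed to the joint-angle or 3D-position animator
```

In the paper each dance figure is also associated with a beat frequency learned during training; beats_per_figure above is only a placeholder for that per-figure mapping.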
Pages: 93–103 (10 pages)
Related papers (50 in total)
  • [31] DiffTalk: Crafting Diffusion Models for Generalized Audio-Driven Portraits Animation
    Shen, Shuai
    Zhao, Wenliang
    Meng, Zibin
    Li, Wanhua
    Zhu, Zheng
    Zhou, Jie
    Lu, Jiwen
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR, 2023, : 1982 - 1991
  • [32] Listen, Denoise, Action! Audio-Driven Motion Synthesis with Diffusion Models
    Alexanderson, Simon
    Nagy, Rajmund
    Beskow, Jonas
    Henter, Gustav Eje
    ACM TRANSACTIONS ON GRAPHICS, 2023, 42 (04)
  • [33] Efficient Emotional Adaptation for Audio-Driven Talking-Head Generation
    Gan, Yuan
    Yang, Zongxin
    Yue, Xihang
    Sun, Lingyun
    Yang, Yi
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 22577 - 22588
  • [34] FaceTalk: Audio-Driven Motion Diffusion for Neural Parametric Head Models
    Aneja, Shivangi
    Thies, Justus
    Dai, Angela
    Niessner, Matthias
    2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2024, : 21263 - 21273
  • [35] Audio-Driven Lips and Expression on 3D Human Face
    Ma, Le
    Ma, Zhihao
    Meng, Weiliang
    Xu, Shibiao
    Zhang, Xiaopeng
    ADVANCES IN COMPUTER GRAPHICS, CGI 2023, PT II, 2024, 14496 : 15 - 26
  • [36] Emotional Semantic Neural Radiance Fields for Audio-Driven Talking Head
    Lin, Haodong
    Wu, Zhonghao
    Zhang, Zhenyu
    Ma, Chao
    Yang, Xiaokang
    ARTIFICIAL INTELLIGENCE, CICAI 2022, PT II, 2022, 13605 : 532 - 544
  • [37] Let's Play Music: Audio-driven Performance Video Generation
    Zhu, Hao
    Li, Yi
    Zhu, Feixia
    Zheng, Aihua
    He, Ran
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 3574 - 3581
  • [38] Audio-driven talking face generation with diverse yet realistic facial animations
    Wu, Rongliang
    Yu, Yingchen
    Zhan, Fangneng
    Zhang, Jiahui
    Zhang, Xiaoqin
    Lu, Shijian
    PATTERN RECOGNITION, 2023, 144
  • [39] Imitating Arbitrary Talking Style for Realistic Audio-Driven Talking Face Synthesis
    Wu, Haozhe
    Jia, Jia
    Wang, Haoyu
    Dou, Yishun
    Duan, Chao
    Deng, Qingshan
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 1478 - 1486
  • [40] MODA: Mapping-Once Audio-driven Portrait Animation with Dual Attentions
    Liu, Yunfei
    Lin, Lijian
    Yu, Fei
    Zhou, Changyin
    Li, Yu
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 22963 - 22972