An audio-driven dancing avatar

Cited by: 0
Authors
Ferda Ofli
Yasemin Demir
Yücel Yemez
Engin Erzin
A. Murat Tekalp
Koray Balcı
İdil Kızoğlu
Lale Akarun
Cristian Canton-Ferrer
Joëlle Tilmanne
Elif Bozkurt
A. Tanju Erdem
Affiliations
[1] Koç University, Multimedia, Vision and Graphics Laboratory
[2] Boğaziçi University, Multimedia Group
[3] Technical University of Catalonia, Image and Video Processing Group
[4] Faculty of Engineering of Mons, TCTS Lab
[5] Momentum Digital Media Technologies
Keywords
Multicamera motion capture; Audio-driven body motion synthesis; Dancing avatar animation;
DOI
Not available
Abstract
We present a framework for training and synthesis of an audio-driven dancing avatar. The avatar is trained for a given musical genre using the multicamera video recordings of a dance performance. The video is analyzed to capture the time-varying posture of the dancer’s body whereas the musical audio signal is processed to extract the beat information. We consider two different marker-based schemes for the motion capture problem. The first scheme uses 3D joint positions to represent the body motion whereas the second uses joint angles. Body movements of the dancer are characterized by a set of recurring semantic motion patterns, i.e., dance figures. Each dance figure is modeled in a supervised manner with a set of HMM (Hidden Markov Model) structures and the associated beat frequency. In the synthesis phase, an audio signal of unknown musical type is first classified, within a time interval, into one of the genres that have been learnt in the analysis phase, based on mel frequency cepstral coefficients (MFCC). The motion parameters of the corresponding dance figures are then synthesized via the trained HMM structures in synchrony with the audio signal based on the estimated tempo information. Finally, the generated motion parameters, either the joint angles or the 3D joint positions of the body, are animated along with the musical audio using two different animation tools that we have developed. Experimental results demonstrate the effectiveness of the proposed framework.
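The abstract describes synthesizing motion parameters from trained HMM structures in synchrony with the estimated tempo. The sketch below illustrates that idea only in miniature, not the authors' implementation: a toy left-to-right Gaussian HMM (all state counts, transition probabilities, and emission parameters are invented for illustration) samples a short joint-angle trajectory whose length is scaled to one beat period of an assumed tempo.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy Gaussian HMM for one dance figure: 3 hidden states, each emitting
# a 4-dimensional joint-angle vector (all parameters are illustrative).
n_states, n_dims = 3, 4
trans = np.array([[0.8, 0.2, 0.0],   # left-to-right transition matrix
                  [0.0, 0.8, 0.2],
                  [0.0, 0.0, 1.0]])
means = rng.normal(size=(n_states, n_dims))  # per-state emission means
stds = np.full((n_states, n_dims), 0.05)     # per-state emission std devs

def synthesize_figure(n_frames):
    """Sample a joint-angle trajectory of n_frames from the toy HMM."""
    frames = np.empty((n_frames, n_dims))
    state = 0  # a figure starts in the first state
    for t in range(n_frames):
        frames[t] = rng.normal(means[state], stds[state])
        state = rng.choice(n_states, p=trans[state])
    return frames

# Beat synchrony: make the synthesized figure span one beat period,
# given a frame rate and a tempo assumed to come from audio analysis.
fps = 25          # animation frame rate (assumption)
tempo_bpm = 125   # tempo estimated from the audio (assumption)
frames_per_beat = int(round(fps * 60 / tempo_bpm))
motion = synthesize_figure(frames_per_beat)
print(motion.shape)  # (12, 4): 12 frames of 4 joint angles per beat
```

In the actual framework each dance figure has its own trained HMM and associated beat frequency; here a single hand-set model stands in for both.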
Pages: 93–103
Page count: 10
Related papers (50 total)
  • [1] An audio-driven dancing avatar
    Ofli, Ferda
    Demir, Yasemin
    Yemez, Yucel
    Erzin, Engin
    Tekalp, A. Murat
    Balci, Koray
    Kizoglu, Idil
    Akarun, Lale
    Canton-Ferrer, Cristian
    Tilmanne, Joelle
    Bozkurt, Elif
    Erdem, A. Tanju
    JOURNAL ON MULTIMODAL USER INTERFACES, 2008, 2 (02) : 93 - 103
  • [2] Photorealistic Audio-driven Video Portraits
    Wen, Xin
    Wang, Miao
    Richardt, Christian
    Chen, Ze-Yin
    Hu, Shi-Min
    IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, 2020, 26 (12) : 3457 - 3466
  • [3] Audio-Driven Laughter Behavior Controller
    Ding, Yu
    Huang, Jing
    Pelachaud, Catherine
    IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2017, 8 (04) : 546 - 558
  • [4] Audio-Driven Emotional Video Portraits
    Ji, Xinya
    Zhou, Hang
    Wang, Kaisiyuan
    Wu, Wayne
    Loy, Chen Change
    Cao, Xun
    Xu, Feng
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 14075 - 14084
  • [5] Audio-Driven Multimedia Content Authentication as a Service
    Vryzas, Nikolaos
    Katsaounidou, Anastasia
    Kotsakis, Rigas
    Dimoulas, Charalampos
    Kalliris, George
    146TH AES CONVENTION, 2019,
  • [6] Audio-Driven Talking Face Generation: A Review
    Liu, Shiguang
    JOURNAL OF THE AUDIO ENGINEERING SOCIETY, 2023, 71 (7-8): : 408 - 419
  • [7] Audio-Driven Talking Video Frame Restoration
    Cheng, Harry
    Guo, Yangyang
    Yin, Jianhua
    Chen, Haonan
    Wang, Jiafang
    Nie, Liqiang
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 4110 - 4122
  • [8] Audio-Driven Facial Animation with Deep Learning: A Survey
    Jiang, Diqiong
    Chang, Jian
    You, Lihua
    Bian, Shaojun
    Kosk, Robert
    Maguire, Greg
    INFORMATION, 2024, 15 (11)
  • [9] Touch the Sound: Audio-Driven Tactile Feedback for Audio Mixing Applications
    Merchel, Sebastian
    Altinsoy, M. Ercan
    Stamm, Maik
    JOURNAL OF THE AUDIO ENGINEERING SOCIETY, 2012, 60 (1-2): : 47 - 53
  • [10] Multi-Task Audio-Driven Facial Animation
    Kim, Youngsoo
    An, Shounan
    Jo, Youngbak
    Park, Seungje
    Kang, Shindong
    Oh, Insoo
    Kim, Duke Donghyun
    SIGGRAPH '19 - ACM SIGGRAPH 2019 POSTERS, 2019,