Motion to Dance Music Generation using Latent Diffusion Model

Cited by: 2
Authors
Tan, Vanessa [1 ]
Nam, JungHyun [1 ]
Nam, Juhan [1 ]
Noh, Junyong [1 ]
Affiliations
[1] KAIST GSCT, Daejeon, South Korea
Funding
National Research Foundation of Singapore
Keywords
3D motion to music; music generation; latent diffusion model;
DOI
10.1145/3610543.3626164
Chinese Library Classification
TP3 [Computing technology; computer technology]
Discipline Code
0812
Abstract
The role of music in games and animation, particularly in dance content, is essential for creating immersive and entertaining experiences. Although recent studies have made strides in generating dance music from videos, their practicality for integrating music into games and animation remains limited. In this context, we present a method capable of generating plausible dance music from 3D motion data and genre labels. Our approach leverages a combination of a UNET-based latent diffusion model and a pre-trained VAE model. To evaluate the performance of the proposed model, we employ evaluation metrics that assess various audio properties, including beat alignment, audio quality, motion-music correlation, and genre score. The quantitative results show that our approach outperforms previous methods. Furthermore, we demonstrate that our model can generate audio that fits seamlessly to in-the-wild motion data. This capability enables us to create plausible dance music that complements the dynamic movements of characters and enhances the overall audiovisual experience in interactive media. Examples from our proposed model are available at this link: https://dmdproject.github.io/.
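The pipeline the abstract describes (a conditional UNET-based latent diffusion model whose generated latent is decoded to audio by a pre-trained VAE) can be sketched as a standard DDPM-style reverse process. This is a minimal illustrative sketch only, not the authors' implementation: the denoiser and decoder are stubs, and all shapes, the number of steps, and the linear noise schedule are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 50                                # number of diffusion steps (assumed)
betas = np.linspace(1e-4, 0.02, T)    # linear noise schedule (assumed)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def denoiser_stub(z_t, t, motion_feat, genre_id):
    """Stand-in for the conditional UNet: predicts the noise in z_t.
    A real model would condition on motion features and a genre embedding."""
    return 0.1 * z_t  # placeholder noise prediction

def vae_decode_stub(z):
    """Stand-in for the pre-trained VAE decoder (audio latent -> waveform)."""
    return np.tanh(z).reshape(-1)

def generate(motion_feat, genre_id, latent_shape=(16, 64)):
    """DDPM reverse process over the audio latent, then VAE decode."""
    z = rng.standard_normal(latent_shape)       # start from pure noise
    for t in reversed(range(T)):
        eps = denoiser_stub(z, t, motion_feat, genre_id)
        # Posterior mean of the DDPM reverse step
        z = (z - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        if t > 0:                               # add noise except at the last step
            z = z + np.sqrt(betas[t]) * rng.standard_normal(latent_shape)
    return vae_decode_stub(z)

motion = rng.standard_normal((120, 72))  # e.g. 120 frames of joint features (assumed)
audio = generate(motion, genre_id=3)
print(audio.shape)  # flattened (16, 64) latent -> (1024,) waveform
```

In the paper's actual system the denoiser is trained to predict noise given encoded motion and genre conditioning, and the VAE maps between waveforms (or spectrograms) and the compact latent space in which diffusion runs.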
Pages: 4
Related Papers
50 total
  • [31] Classification of Salsa Dance Level using Music and Interaction based Motion Features
    Senecal, Simon
    Nijdam, Niels A.
    Thalmann, Nadia Magnenat
    PROCEEDINGS OF THE 14TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS (GRAPP), VOL 1, 2019, : 100 - 109
  • [32] GLDM: hit molecule generation with constrained graph latent diffusion model
    Wang, Conghao
    Ong, Hiok Hian
    Chiba, Shunsuke
    Rajapakse, Jagath C.
    BRIEFINGS IN BIOINFORMATICS, 2024, 25 (03)
  • [33] EDGE: Editable Dance Generation From Music
    Tseng, Jonathan
    Castellon, Rodrigo
    Liu, C. Karen
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR, 2023, : 448 - 458
  • [34] Music-to-Dance Generation with Multiple Conformer
    Zhang, Mingao
    Liu, Changhong
    Chen, Yong
    Lei, Zhenchun
    Wang, Mingwen
    PROCEEDINGS OF THE 2022 INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2022, 2022, : 34 - 38
  • [35] SOUL TRAIN The Music, Dance, and Style of a Generation
    McNatt, Rosemary Bray
    NEW YORK TIMES BOOK REVIEW, 2014, 119 (22): : 40 - 40
  • [36] Soul Train: The Music, Dance, and Style of a Generation
    Weisbard, Eric
    AMERICAN QUARTERLY, 2015, 67 (01) : 254 - 265
  • [37] MoVideo: Motion-Aware Video Generation with Diffusion Model
    Liang, Jingyun
    Fang, Yuchen
    Zhang, Kai
    Timofte, Radu
    Van Gool, Luc
    Ranjan, Rakesh
    COMPUTER VISION-ECCV 2024, PT XLIV, 2025, 15102 : 56 - 74
  • [38] Graphusion: Latent Diffusion for Graph Generation
    Yang, Ling
    Huang, Zhilin
    Zhang, Zhilong
    Liu, Zhongyi
    Hong, Shenda
    Zhang, Wentao
    Yang, Wenming
    Cui, Bin
    Zhang, Luxia
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2024, 36 (11) : 6358 - 6369
  • [39] Detecting dance motion structure through music analysis
    Shiratori, T
    Nakazawa, A
    Ikeuchi, K
    SIXTH IEEE INTERNATIONAL CONFERENCE ON AUTOMATIC FACE AND GESTURE RECOGNITION, PROCEEDINGS, 2004, : 857 - 862
  • [40] Music-Dance: Sound and Motion in Contemporary Discourse
    Acevedo, Lucia C.
    DANCE RESEARCH, 2021, 39 (02) : 274 - 277