Executing your Commands via Motion Diffusion in Latent Space

Cited by: 80
Authors
Chen, Xin [1 ]
Jiang, Biao [2 ]
Liu, Wen [1 ]
Huang, Zilong [1 ]
Fu, Bin [1 ]
Chen, Tao [2 ]
Yu, Gang [1 ]
Affiliations
[1] Tencent PCG, Shenyang, Peoples R China
[2] Fudan Univ, Shanghai, Peoples R China
Keywords
DOI
10.1109/CVPR52729.2023.01726
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We study conditional human motion generation, a challenging task that produces plausible human motion sequences from conditional inputs such as action classes or textual descriptors. Because human motions are highly diverse and their distribution differs markedly from that of the conditional modalities, such as textual descriptors in natural language, it is hard to learn a probabilistic mapping from the desired conditional modality to human motion sequences. In addition, raw motion data from a motion capture system can be redundant within sequences and contain noise; directly modeling the joint distribution over raw motion sequences and conditional modalities would incur heavy computational overhead and might introduce artifacts caused by the captured noise. To learn a better representation of diverse human motion sequences, we first design a powerful Variational AutoEncoder (VAE) that yields a representative, low-dimensional latent code for each human motion sequence. Then, instead of using a diffusion model to connect the raw motion sequences to the conditional inputs, we perform the diffusion process in the motion latent space. Our proposed Motion Latent-based Diffusion model (MLD) produces vivid motion sequences that conform to the given conditional inputs and substantially reduces the computational overhead in both the training and inference stages. Extensive experiments on various human motion generation tasks demonstrate that MLD achieves significant improvements over state-of-the-art methods while being two orders of magnitude faster than previous diffusion models that operate on raw motion sequences.
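The core idea in the abstract, running diffusion on a compact latent code rather than on the raw motion sequence, can be illustrated with a minimal sketch. This is not the authors' implementation: the sequence length, per-frame feature size, latent size, linear noise schedule, and all function names below are assumptions chosen for illustration, and a random vector stands in for the output of a trained VAE encoder. The sketch shows only the standard forward noising step q(z_t | z_0) applied to a latent vector, plus how much smaller that vector is than a flattened raw sequence.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (a common DDPM choice, assumed here)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """Return alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(z0, t, alpha_bars, rng):
    """Sample z_t = sqrt(alpha_bar_t) * z0 + sqrt(1 - alpha_bar_t) * eps."""
    a = alpha_bars[t]
    eps = [rng.gauss(0.0, 1.0) for _ in z0]
    zt = [math.sqrt(a) * z + math.sqrt(1.0 - a) * e for z, e in zip(z0, eps)]
    return zt, eps


# Hypothetical sizes, chosen only to illustrate the dimensionality gap.
NUM_FRAMES = 196        # assumed number of frames in one motion sequence
FEATS_PER_FRAME = 263   # assumed per-frame pose feature size
LATENT_DIM = 256        # assumed size of the VAE latent code

raw_size = NUM_FRAMES * FEATS_PER_FRAME
reduction = raw_size / LATENT_DIM

rng = random.Random(0)
# Stand-in for a latent code produced by a trained VAE encoder.
z0 = [rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]

betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)
t = 500
zt, eps = add_noise(z0, t, alpha_bars, rng)

print(f"raw values per sequence: {raw_size}")
print(f"latent values per sequence: {LATENT_DIM}")
print(f"values the denoiser handles shrink by about {reduction:.0f}x")
```

Under these assumed sizes, each denoising step touches a few hundred values instead of tens of thousands. That gives some intuition for the reduced cost the abstract reports, though it is not a measurement of the reported speedup.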
Pages: 18000-18010
Page count: 11