Generating Diverse and Natural 3D Human Motions from Text

Cited by: 149
Authors
Guo, Chuan [1]
Zou, Shihao [1]
Zuo, Xinxin [1]
Wang, Sen [1]
Ji, Wei [1]
Li, Xingyu [1]
Cheng, Li [1]
Affiliations
[1] Univ Alberta, Edmonton, AB, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
DOI
10.1109/CVPR52688.2022.00509
Chinese Library Classification number
TP18 [Theory of artificial intelligence];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Automated generation of 3D human motions from text is a challenging problem. The generated motions are expected to be sufficiently diverse to explore the text-grounded motion space and, more importantly, to accurately depict the content of the prescribed text descriptions. Here we tackle this problem with a two-stage approach: text2length sampling and text2motion generation. Text2length involves sampling from the learned distribution function of motion lengths conditioned on the input text. This is followed by our text2motion module, which uses a temporal variational autoencoder to synthesize a diverse set of human motions of the sampled lengths. Instead of directly engaging with pose sequences, we propose motion snippet code as our internal motion representation, which captures local semantic motion contexts and is empirically shown to facilitate the generation of plausible motions faithful to the input text. Moreover, a large-scale dataset of scripted 3D human motions, HumanML3D, is constructed, consisting of 14,616 motion clips and 44,970 text descriptions.
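The two-stage flow described in the abstract (sample a length conditioned on the text, then generate a motion of that length) can be illustrated with a toy sketch. Everything below is a hypothetical stand-in and not the authors' implementation: the function names, the word-count-based length weights, the 4-frame length unit, and the random-walk "generator" are assumptions made only to show the control flow.

```python
import random


def sample_length(text, rng, unit=4, max_units=50):
    """Stage 1 (text2length): draw a motion length from a categorical
    distribution conditioned on the text. The weights are a toy
    assumption, not a learned model."""
    n_words = len(text.split())
    candidates = range(1, max_units + 1)
    # Toy assumption: longer descriptions favor longer motions.
    weights = [1.0 / (1 + abs(k - 2 * n_words)) for k in candidates]
    k = rng.choices(candidates, weights=weights, k=1)[0]
    return k * unit  # length in frames


def generate_motion(text, length, rng, pose_dim=3):
    """Stage 2 (text2motion): placeholder generator. A fresh latent
    sample per step stands in for the stochastic part of a temporal VAE;
    the text is accepted but unused in this toy version."""
    pose = [0.0] * pose_dim
    motion = []
    for _ in range(length):
        z = [rng.gauss(0.0, 1.0) for _ in range(pose_dim)]
        pose = [p + 0.1 * zi for p, zi in zip(pose, z)]
        motion.append(pose)
    return motion


def text_to_motions(text, num_samples=3, seed=0):
    """Run both stages several times to obtain a set of varied motions."""
    rng = random.Random(seed)
    motions = []
    for _ in range(num_samples):
        length = sample_length(text, rng)
        motions.append(generate_motion(text, length, rng))
    return motions
```

Calling `text_to_motions("a person walks forward")` returns three pose sequences. Their lengths may differ because each length is sampled independently in the first stage.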
Pages: 5142-5151
Page count: 10
Related papers
50 in total
  • [1] Synthesizing Diverse Human Motions in 3D Indoor Scenes
    Zhao, Kaifeng
    Zhang, Yan
    Wang, Shaofei
    Beeler, Thabo
    Tang, Siyu
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 14692 - 14703
  • [2] Generating Continual Human Motion in Diverse 3D Scenes
    Mir, Aymen
    Puig, Xavier
    Kanazawa, Angjoo
    Pons-Moll, Gerard
    2024 INTERNATIONAL CONFERENCE IN 3D VISION, 3DV 2024, 2024, : 903 - 913
  • [3] TEMOS: Generating Diverse Human Motions from Textual Descriptions
    Petrovich, Mathis
    Black, Michael J.
    Varol, Gul
    COMPUTER VISION, ECCV 2022, PT XXII, 2022, 13682 : 480 - 497
  • [4] Generating diverse clothed 3D human animations via a generative model
    Shi, Min
    Feng, Wenke
    Gao, Lin
    Zhu, Dengming
    COMPUTATIONAL VISUAL MEDIA, 2024, 10 (02) : 261 - 277
  • [5] Generating diverse clothed 3D human animations via a generative model
    Min Shi
    Wenke Feng
    Lin Gao
    Dengming Zhu
    Computational Visual Media, 2024, 10 : 261 - 277
  • [6] Generating Various 3D Motions by Emergent Imitation Learning
    Mitsunobu, Ryusei
    Oshima, Chika
    Nakayama, Koichi
    HUMAN INTERFACE AND THE MANAGEMENT OF INFORMATION, HIMI 2023, PT I, 2023, 14015 : 516 - 530
  • [7] Generating Diverse 3D Reconstructions from a Single Occluded Face Image
    Dey, Rahul
    Boddeti, Vishnu Naresh
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 1537 - 1547
  • [8] Generating Human Interaction Motions in Scenes with Text Control
    Yi, Hongwei
    Thies, Justus
    Black, Michael J.
    Peng, Xue Bin
    Rempel, Davis
    COMPUTER VISION-ECCV 2024, PT IV, 2025, 15062 : 246 - 263
  • [9] Generating Holistic 3D Human Motion from Speech
    Yi, Hongwei
    Liang, Hualin
    Liu, Yifei
    Cao, Qiong
    Wen, Yandong
    Bolkart, Timo
    Tao, Dacheng
    Black, Michael J.
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR, 2023, : 469 - 480
  • [10] Efficient Indexing of 3D Human Motions
    Budikova, Petra
    Sedmidubsky, Jan
    Zezula, Pavel
    PROCEEDINGS OF THE 2021 INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL (ICMR '21), 2021, : 10 - 18