AnimatableDreamer: Text-Guided Non-rigid 3D Model Generation and Reconstruction with Canonical Score Distillation

Authors
Wang, Xinzhou [1 ,2 ,3 ,4 ]
Wang, Yikai [2 ]
Yee, Junliang [2 ]
Sung, Fuchun [2 ]
Wang, Zhengyi [2 ,3 ]
Wang, Ling [2 ,6 ]
Liu, Pengkun [2 ,7 ]
Sung, Kai [2 ]
Wan, Xintong [8 ]
Xie, Wende [5 ]
Liu, Fangfu [2 ]
He, Bin [1 ]
Affiliations
[1] Tongji Univ, Shanghai, Peoples R China
[2] Tsinghua Univ, Beijing, Peoples R China
[3] ShengShu, Beijing, Peoples R China
[4] Tencent, Shenzhen, Peoples R China
[5] Didi, Beijing, Peoples R China
[6] Xian Res Inst High Tech, Xian, Peoples R China
[7] Fudan Univ, Shanghai, Peoples R China
[8] Zhejiang Univ, Hangzhou, Peoples R China
Source
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation; US National Science Foundation;
Keywords
4D generation; Diffusion model; Non-rigid reconstruction;
DOI
10.1007/978-3-031-72698-9_19
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Advances in 3D generation have facilitated sequential 3D model generation (a.k.a. 4D generation), yet its application to animatable objects with large motion remains scarce. Our work proposes AnimatableDreamer, a text-to-4D generation framework capable of generating diverse categories of non-rigid objects on skeletons extracted from a monocular video. At its core, AnimatableDreamer is equipped with our novel optimization design dubbed Canonical Score Distillation (CSD), which lifts 2D diffusion to temporally consistent 4D generation. CSD, designed from a score gradient perspective, generates a canonical model with warp-robustness across different articulations. Notably, it also enhances the authenticity of bones and skinning by integrating inductive priors from a diffusion model. Furthermore, with multi-view distillation, CSD infers invisible regions, thereby improving the fidelity of monocular non-rigid reconstruction. Extensive experiments demonstrate the capability of our method in generating high-flexibility text-guided 3D models from a monocular video, while also showing improved reconstruction performance over existing non-rigid reconstruction methods. Project page: https://zz7379.github.io/AnimatableDreamer/.
Pages: 321 - 339
Page count: 19
Related papers
  • [21] HDM-Net: Monocular Non-rigid 3D Reconstruction with Learned Deformation Model
    Golyanik, Vladislav
    Shimada, Soshi
    Varanasi, Kiran
    Stricker, Didier
    VIRTUAL REALITY AND AUGMENTED REALITY, EUROVR 2018, 2018, 11162 : 51 - 72
  • [22] EXIM: A Hybrid Explicit-Implicit Representation for Text-Guided 3D Shape Generation
    Liu, Zhengzhe
    Hu, Jingyu
    Hui, Ka-Hei
    Qi, Xiaojuan
    Cohen-Or, Daniel
    Fu, Chi-Wing
    ACM TRANSACTIONS ON GRAPHICS, 2023, 42 (06):
  • [23] TAPS3D: Text-Guided 3D Textured Shape Generation from Pseudo Supervision
    Wei, Jiacheng
    Wang, Hao
    Feng, Jiashi
    Lin, Guosheng
    Yap, Kim-Hui
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 16805 - 16815
  • [24] 3D-TOGO: Towards Text-Guided Cross-Category 3D Object Generation
    Jiang, Zutao
    Lu, Guansong
    Liang, Xiaodan
    Zhu, Jihua
    Zhang, Wei
    Chang, Xiaojun
    Xu, Hang
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 1, 2023, : 1051 - 1059
  • [25] Skeleton-based canonical forms for non-rigid 3D shape retrieval
    David Pickup
    Xianfang Sun
    Paul L. Rosin
    Ralph R. Martin
    Computational Visual Media, 2016, 2 (03) : 231 - 243
  • [26] Skeleton-based canonical forms for non-rigid 3D shape retrieval
    Pickup D.
    Sun X.
    Rosin P.L.
    Martin R.R.
    Computational Visual Media, 2016, 2 (3) : 231 - 243
  • [27] TexGen: Text-Guided 3D Texture Generation with Multi-view Sampling and Resampling
    Huo, Dong
    Guo, Zixin
    Zuo, Xinxin
    Shi, Zhihao
    Lu, Juwei
    Dai, Peng
    Xu, Songcen
    Cheng, Li
    Yang, Yee-Hong
    COMPUTER VISION-ECCV 2024, PT XXXVIII, 2025, 15096 : 352 - 368
  • [28] Study on the Optimization Method of Dynamic Reconstruction of 3D Non-Rigid Image
    Wang, Chong
    Li, Ming
    RECENT ADVANCES IN ELECTRICAL & ELECTRONIC ENGINEERING, 2016, 9 (03) : 162 - 166
  • [29] A Survey of Non-Rigid 3D Registration
    Deng, Bailin
    Yao, Yuxin
    Dyke, Roberto M.
    Zhang, Juyong
    COMPUTER GRAPHICS FORUM, 2022, 41 (02) : 559 - 589
  • [30] SobolevFusion: 3D Reconstruction of Scenes Undergoing Free Non-rigid Motion
    Slavcheva, Miroslava
    Baust, Maximilian
    Ilic, Slobodan
    2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 2646 - 2655