DiffusionGAN3D: Boosting Text-guided 3D Generation and Domain Adaptation by Combining 3D GANs and Diffusion Priors

Cited by: 0
Authors
Lei, Biwen [1]
Yu, Kai [1]
Feng, Mengyang [1]
Cui, Miaomiao [1]
Xie, Xuansong [1]
Affiliations
[1] Alibaba Group, Hangzhou, People's Republic of China
DOI
10.1109/CVPR52733.2024.00998
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Text-guided domain adaptation and generation of 3D-aware portraits find many applications in various fields. However, due to the lack of training data and the challenges in handling the high variety of geometry and appearance, the existing methods for these tasks suffer from issues like inflexibility, instability, and low fidelity. In this paper, we propose a novel framework, DiffusionGAN3D, which boosts text-guided 3D domain adaptation and generation by combining 3D GANs and diffusion priors. Specifically, we integrate pre-trained 3D generative models (e.g., EG3D) and text-to-image diffusion models. The former provide a strong foundation for stable and high-quality avatar generation from text. The diffusion models, in turn, offer powerful priors and guide the finetuning of the 3D generator with informative direction to achieve flexible and efficient text-guided domain adaptation. To enhance the diversity in domain adaptation and the generation capability in text-to-avatar, we introduce the relative distance loss and a case-specific learnable triplane, respectively. In addition, we design a progressive texture refinement module to improve the texture quality for both tasks. Extensive experiments demonstrate that the proposed framework achieves excellent results in both domain adaptation and text-to-avatar tasks, outperforming existing methods in terms of generation quality and efficiency. The project homepage is at https://younglbw.github.io/DiffusionGAN3D-homepage/.
Pages: 10487-10497
Page count: 11