DiffusionGAN3D: Boosting Text-guided 3D Generation and Domain Adaptation by Combining 3D GANs and Diffusion Priors

Cited by: 0
Authors:
Lei, Biwen [1 ]
Yu, Kai [1 ]
Feng, Mengyang [1 ]
Cui, Miaomiao [1 ]
Xie, Xuansong [1 ]
Affiliations:
[1] Alibaba Group, Hangzhou, People's Republic of China
DOI: 10.1109/CVPR52733.2024.00998
CLC number: TP18 [Artificial Intelligence Theory]
Discipline codes: 081104; 0812; 0835; 1405
Abstract:
Text-guided domain adaptation and generation of 3D-aware portraits find many applications in various fields. However, due to the lack of training data and the challenges in handling the high variety of geometry and appearance, existing methods for these tasks suffer from inflexibility, instability, and low fidelity. In this paper, we propose DiffusionGAN3D, a novel framework that boosts text-guided 3D domain adaptation and generation by combining 3D GANs and diffusion priors. Specifically, we integrate pre-trained 3D generative models (e.g., EG3D) with text-to-image diffusion models. The former provides a strong foundation for stable and high-quality avatar generation from text, while the latter in turn offers powerful priors and guides the finetuning of the 3D generator with informative directions, achieving flexible and efficient text-guided domain adaptation. To enhance diversity in domain adaptation and the generation capability in text-to-avatar, we introduce a relative distance loss and a case-specific learnable triplane, respectively. In addition, we design a progressive texture refinement module that improves texture quality for both tasks. Extensive experiments demonstrate that the proposed framework achieves excellent results in both domain adaptation and text-to-avatar tasks, outperforming existing methods in generation quality and efficiency. The project homepage is at https://younglbw.github.io/DiffusionGAN3D-homepage/.
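The abstract's central mechanism, a frozen text-to-image diffusion model supplying an "informative direction" for finetuning the 3D generator, follows the score-distillation pattern familiar from recent text-to-3D work. The PyTorch sketch below illustrates that pattern under stated assumptions: `generator` and `diffusion`, along with their `add_noise` and `predict_noise` methods, are hypothetical placeholders rather than the authors' implementation.

```python
# Minimal sketch of score-distillation-style guidance (in the spirit of SDS from
# DreamFusion) for finetuning a 3D GAN with a frozen text-to-image diffusion prior.
# `generator` and `diffusion` (with `add_noise` / `predict_noise`) are hypothetical
# placeholders, NOT the authors' actual API; the real framework builds on EG3D and
# adds further losses (e.g., the relative distance loss) not shown here.
import torch

def sds_step(generator, diffusion, latent, camera, text_emb,
             num_train_timesteps=1000, guidance_scale=100.0):
    """One guidance step: render a view from the 3D GAN, noise it, and pull the
    generator's parameters toward images the diffusion prior finds likely."""
    image = generator(latent, camera)            # rendered view, (B, 3, H, W) in [-1, 1]
    bsz = image.shape[0]
    t = torch.randint(20, num_train_timesteps, (bsz,), device=image.device)
    noise = torch.randn_like(image)
    noisy = diffusion.add_noise(image, noise, t)  # forward-diffuse the render
    with torch.no_grad():                         # the diffusion prior stays frozen
        eps_text = diffusion.predict_noise(noisy, t, text_emb)
        eps_uncond = diffusion.predict_noise(noisy, t, None)
        # classifier-free guidance sharpens the text-conditioned direction
        eps = eps_uncond + guidance_scale * (eps_text - eps_uncond)
    # SDS uses (eps - noise) as the gradient w.r.t. the rendered image and skips
    # the U-Net Jacobian; the detached surrogate loss below reproduces exactly that.
    grad = eps - noise
    loss = (grad.detach() * image).sum() / bsz
    loss.backward()                               # gradients flow into the generator
    return loss.item()
```

Per the abstract, this guidance is combined in the full framework with the relative distance loss (for diversity) and the progressive texture refinement module; the sketch covers only the diffusion-guidance core.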
Pages: 10487-10497 (11 pages)
Related papers (50 in total; entries [31]-[40] shown):
• [31] Yu, Cuican; Lu, Guansong; Zeng, Yihan; Sun, Jian; Liang, Xiaodan; Li, Huibin; Xu, Zongben; Xu, Songcen; Zhang, Wei; Xu, Hang. Towards High-Fidelity Text-Guided 3D Face Generation and Manipulation Using only Images. 2023 IEEE/CVF International Conference on Computer Vision (ICCV 2023), 2023: 15280-15291.
• [32] Brandao, Susana; Costeira, Joao P.; Veloso, Manuela. Combining 3D Shape and Color for 3D Object Recognition. Image Analysis and Recognition (ICIAR 2016), 2016, 9730: 481-489.
• [33] Kim, Gwanghyun; Chun, Se Young. DATID-3D: Diversity-Preserved Domain Adaptation Using Text-to-Image Diffusion for 3D Generative Model. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023: 14203-14213.
• [34] Nie, Weizhi; Jiao, Chuanqi; Chang, Rihao; Qu, Lei; Liu, An-An. CPG3D: Cross-Modal Priors Guided 3D Object Reconstruction. IEEE Transactions on Multimedia, 2023, 25: 9383-9396.
• [35] Hu, P.; Munteanu, A. Method for registration of 3D shapes without overlap for known 3D priors. Electronics Letters, 2021, 57(9): 357-359.
• [36] Wang, Chen; Peng, Hao-Yang; Liu, Ying-Tian; Gu, Jiatao; Hu, Shi-Min. Diffusion models for 3D generation: A survey. Computational Visual Media, 2025, 11(1): 1-28.
• [37] Hoogeboom, Emiel; Satorras, Victor Garcia; Vignac, Clement; Welling, Max. Equivariant Diffusion for Molecule Generation in 3D. International Conference on Machine Learning, Vol. 162, 2022.
• [38] Jin, Wonjoon; Ryu, Nuri; Kim, Geonung; Baek, Seung-Hwan; Cho, Sunghyun. Dr.3D: Adapting 3D GANs to Artistic Drawings. Proceedings SIGGRAPH Asia 2022, 2022.
• [39] Simsar, Enis; Tonioni, Alessio; Ornek, Evin Pinar; Tombari, Federico. LatentSwap3D: Semantic Edits on 3D Image GANs. 2023 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), 2023: 2891-2901.
• [40] Hui, Ka-Hei; Li, Ruihui; Hu, Jingyu; Fu, Chi-Wing. Neural Wavelet-domain Diffusion for 3D Shape Generation. Proceedings SIGGRAPH Asia 2022, 2022.