Portrait3D: Text-Guided High-Quality 3D Portrait Generation Using Pyramid Representation and GANs Prior

被引:0
|
作者
Wu, Yiqian [1 ]
Xu, Hao [1 ]
Tang, Xiangjun [1 ]
Chen, Xien [2 ]
Tang, Siyu [3 ]
Zhang, Zhebin [4 ]
Li, Chen [4 ]
Jin, Xiaogang [1 ]
机构
[1] Zhejiang Univ, State Key Lab CAD&CG, Hangzhou, Peoples R China
[2] Yale Univ, New Haven, CT USA
[3] Swiss Fed Inst Technol, Zurich, Switzerland
[4] OPPO US Res Ctr, Menlo Pk, CA USA
来源
ACM TRANSACTIONS ON GRAPHICS | 2024年 / 43卷 / 04期
基金
中国国家自然科学基金;
关键词
3D portrait generation; 3D-aware GANs; diffusion models;
D O I
10.1145/3658162
中图分类号
TP31 [计算机软件];
学科分类号
081202 ; 0835 ;
摘要
Existing neural rendering-based text-to-3D-portrait generation methods typically make use of human geometry prior and diffusion models to obtain guidance. However, relying solely on geometry information introduces issues such as the Janus problem, over-saturation, and over-smoothing. We present Portrait3D, a novel neural rendering-based framework with a novel joint geometry-appearance prior to achieve text-to-3D-portrait generation that overcomes the aforementioned issues. To accomplish this, we train a 3D portrait generator, 3DPortraitGAN(sic), as a robust prior. This generator is capable of producing 360 degrees. canonical 3D portraits, serving as a starting point for the subsequent diffusion-based generation process. To mitigate the "grid-like" artifact caused by the high-frequency information in the featuremap-based 3D representation commonly used by most 3D-aware GANs, we integrate a novel pyramid tri-grid 3D representation into 3DPortraitGAN(sic). To generate 3D portraits from text, we first project a randomly generated image aligned with the given prompt into the pre-trained 3DPortraitGAN(sic) 's latent space. The resulting latent code is then used to synthesize a pyramid tri-grid. Beginning with the obtained pyramid tri-grid, we use score distillation sampling to distill the diffusion model's knowledge into the pyramid tri-grid. Following that, we utilize the diffusion model to refine the rendered images of the 3D portrait and then use these refined images as training data to further optimize the pyramid tri-grid, effectively eliminating issues with unrealistic color and unnatural artifacts. Our experimental results show that Portrait3D can produce realistic, high-quality, and canonical 3D portraits that align with the prompt.
引用
收藏
页数:12
相关论文
共 50 条
  • [1] HyperStyle3D: Text-Guided 3D Portrait Stylization via Hypernetworks
    Chen Z.
    Xu X.
    Yan Y.
    Pan Y.
    Zhu W.
    Wu W.
    Dai B.
    Yang X.
    IEEE Transactions on Circuits and Systems for Video Technology, 2024, 34 (10) : 1 - 1
  • [2] Towards Implicit Text-Guided 3D Shape Generation
    Liu, Zhengzhe
    Wang, Yi
    Qi, Xiaojuan
    Fu, Chi-Wing
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 17875 - 17885
  • [3] EXIM: A Hybrid Explicit-Implicit Representation for Text-Guided 3D Shape Generation
    Liu, Zhengzhe
    Hu, Jingyu
    Hui, Ka-Hei
    Qi, Xiaojuan
    Cohen-Or, Daniel
    Fu, Chi-Wing
    ACM TRANSACTIONS ON GRAPHICS, 2023, 42 (06):
  • [4] DREAMCRAFT: Text-Guided Generation of Functional 3D Environments in Minecraft
    Earle, Sam
    Kokkinos, Filippos
    Nie, Yuhe
    Togelius, Julian
    Raileanu, Roberta
    PROCEEDINGS OF THE 19TH INTERNATIONAL CONFERENCE ON THE FOUNDATIONS OF DIGITAL GAMES, FDG 2024, 2024,
  • [5] TECA: Text-Guided Generation and Editing of Compositional 3D Avatars
    Zhang, Hao
    Feng, Yao
    Kulits, Peter
    Wen, Yandong
    Thies, Justus
    Black, Michael J.
    2024 INTERNATIONAL CONFERENCE IN 3D VISION, 3DV 2024, 2024, : 1520 - 1530
  • [6] Advances in text-guided 3D editing: a survey
    Lu, Lihua
    Li, Ruyang
    Zhang, Xiaohui
    Wei, Hui
    Du, Guoguang
    Wang, Binqiang
    Artificial Intelligence Review, 2024, 57 (12)
  • [7] A Survey of Text-guided 3D Face Reconstruction
    Cen, Mengyue
    Shen, Haoran
    Zhao, Wangyan
    Pan, Dingcheng
    Feng, Xiaoyi
    2024 3RD INTERNATIONAL CONFERENCE ON IMAGE PROCESSING AND MEDIA COMPUTING, ICIPMC 2024, 2024, : 82 - 87
  • [8] TEXTure: Text-Guided Texturing of 3D Shapes
    Richardson, Elad
    Metzer, Gal
    Alaluf, Yuval
    Giryes, Raja
    Cohen-Or, Daniel
    PROCEEDINGS OF SIGGRAPH 2023 CONFERENCE PAPERS, SIGGRAPH 2023, 2023,
  • [9] DreamStone: Image as a Stepping Stone for Text-Guided 3D Shape Generation
    Liu, Zhengzhe
    Dai, Peng
    Li, Ruihui
    Qi, Xiaojuan
    Fu, Chi-Wing
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (12) : 14385 - 14403
  • [10] Towards High-Fidelity Text-Guided 3D Face Generation and Manipulation Using only Images
    Yu, Cuican
    Lu, Guansong
    Zeng, Yihan
    Sun, Jian
    Liang, Xiaodan
    Li, Huibin
    Xu, Zongben
    Xu, Songcen
    Zhang, Wei
    Xu, Hang
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 15280 - 15291