Portrait3D: Text-Guided High-Quality 3D Portrait Generation Using Pyramid Representation and GANs Prior

Cited by: 0

Authors:

Wu, Yiqian [1]

Xu, Hao [1]

Tang, Xiangjun [1]

Chen, Xien [2]

Tang, Siyu [3]

Zhang, Zhebin [4]

Li, Chen [4]

Jin, Xiaogang [1]

Affiliations:

[1] Zhejiang Univ, State Key Lab CAD&CG, Hangzhou, Peoples R China

[2] Yale Univ, New Haven, CT USA

[3] Swiss Fed Inst Technol, Zurich, Switzerland

[4] OPPO US Res Ctr, Menlo Pk, CA USA

Source:

ACM TRANSACTIONS ON GRAPHICS | 2024 / Vol. 43 / Issue 04

Funding:

National Natural Science Foundation of China;

Keywords:

3D portrait generation; 3D-aware GANs; diffusion models;

DOI:

10.1145/3658162

Chinese Library Classification:

TP31 [Computer Software];

Discipline classification codes:

081202 ; 0835 ;

Abstract:

Existing neural rendering-based text-to-3D-portrait generation methods typically use human geometry priors and diffusion models to obtain guidance. However, relying solely on geometry information introduces issues such as the Janus problem, over-saturation, and over-smoothing. We present Portrait3D, a novel neural rendering-based framework with a novel joint geometry-appearance prior to achieve text-to-3D-portrait generation that overcomes the aforementioned issues. To accomplish this, we train a 3D portrait generator, 3DPortraitGAN, as a robust prior. This generator is capable of producing 360° canonical 3D portraits, serving as a starting point for the subsequent diffusion-based generation process. To mitigate the "grid-like" artifact caused by the high-frequency information in the feature-map-based 3D representation commonly used by most 3D-aware GANs, we integrate a novel pyramid tri-grid 3D representation into 3DPortraitGAN. To generate 3D portraits from text, we first project a randomly generated image aligned with the given prompt into the pre-trained 3DPortraitGAN's latent space. The resulting latent code is then used to synthesize a pyramid tri-grid. Beginning with the obtained pyramid tri-grid, we use score distillation sampling to distill the diffusion model's knowledge into the pyramid tri-grid. Following that, we use the diffusion model to refine the rendered images of the 3D portrait and then use these refined images as training data to further optimize the pyramid tri-grid, effectively eliminating issues with unrealistic color and unnatural artifacts. Our experimental results show that Portrait3D can produce realistic, high-quality, and canonical 3D portraits that align with the prompt.
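
For readers unfamiliar with score distillation sampling (SDS), the step mentioned in the abstract can be sketched with the commonly used SDS gradient from the text-to-3D literature. The abstract does not give the exact loss or weighting used in Portrait3D, so the symbols below are assumptions for illustration: \theta stands for the pyramid tri-grid parameters, g for the renderer, c for a sampled camera, y for the text prompt, \hat{\epsilon}_\phi for the frozen diffusion model's noise prediction, and w(t) for a timestep-dependent weight.

```latex
% Rendered view of the current 3D portrait (theta, g, c are assumed notation)
x = g(\theta, c)

% Forward diffusion of the rendering at timestep t
x_t = \sqrt{\bar{\alpha}_t}\, x + \sqrt{1 - \bar{\alpha}_t}\, \epsilon,
\qquad \epsilon \sim \mathcal{N}(0, I)

% Standard SDS gradient: the residual between predicted and injected noise
% is back-propagated through the renderer only, not through the diffusion model
\nabla_\theta \mathcal{L}_{\mathrm{SDS}}
  = \mathbb{E}_{t,\,\epsilon,\,c}\!\left[
      w(t)\,\bigl(\hat{\epsilon}_\phi(x_t;\, y,\, t) - \epsilon\bigr)\,
      \frac{\partial x}{\partial \theta}
    \right]
```

Under this reading, each optimization step renders a view, adds noise, asks the diffusion model to predict that noise given the prompt, and nudges the tri-grid so that the rendering looks more plausible to the diffusion model.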


Pages: 12

50 items in total

[1] HyperStyle3D: Text-Guided 3D Portrait Stylization via Hypernetworks
Chen Z.
Xu X.
Yan Y.
Pan Y.
Zhu W.
Wu W.
Dai B.
Yang X.
IEEE Transactions on Circuits and Systems for Video Technology, 2024, 34 (10) : 1 - 1
[2] Towards Implicit Text-Guided 3D Shape Generation
Liu, Zhengzhe
Wang, Yi
Qi, Xiaojuan
Fu, Chi-Wing
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 17875 - 17885
[3] EXIM: A Hybrid Explicit-Implicit Representation for Text-Guided 3D Shape Generation
Liu, Zhengzhe
Hu, Jingyu
Hui, Ka-Hei
Qi, Xiaojuan
Cohen-Or, Daniel
Fu, Chi-Wing
ACM TRANSACTIONS ON GRAPHICS, 2023, 42 (06):
[4] DREAMCRAFT: Text-Guided Generation of Functional 3D Environments in Minecraft
Earle, Sam
Kokkinos, Filippos
Nie, Yuhe
Togelius, Julian
Raileanu, Roberta
PROCEEDINGS OF THE 19TH INTERNATIONAL CONFERENCE ON THE FOUNDATIONS OF DIGITAL GAMES, FDG 2024, 2024,
[5] TECA: Text-Guided Generation and Editing of Compositional 3D Avatars
Zhang, Hao
Feng, Yao
Kulits, Peter
Wen, Yandong
Thies, Justus
Black, Michael J.
2024 INTERNATIONAL CONFERENCE IN 3D VISION, 3DV 2024, 2024, : 1520 - 1530
[6] Advances in text-guided 3D editing: a survey
Lu, Lihua
Li, Ruyang
Zhang, Xiaohui
Wei, Hui
Du, Guoguang
Wang, Binqiang
Artificial Intelligence Review, 2024, 57 (12)
[7] A Survey of Text-guided 3D Face Reconstruction
Cen, Mengyue
Shen, Haoran
Zhao, Wangyan
Pan, Dingcheng
Feng, Xiaoyi
2024 3RD INTERNATIONAL CONFERENCE ON IMAGE PROCESSING AND MEDIA COMPUTING, ICIPMC 2024, 2024, : 82 - 87
[8] TEXTure: Text-Guided Texturing of 3D Shapes
Richardson, Elad
Metzer, Gal
Alaluf, Yuval
Giryes, Raja
Cohen-Or, Daniel
PROCEEDINGS OF SIGGRAPH 2023 CONFERENCE PAPERS, SIGGRAPH 2023, 2023,
[9] DreamStone: Image as a Stepping Stone for Text-Guided 3D Shape Generation
Liu, Zhengzhe
Dai, Peng
Li, Ruihui
Qi, Xiaojuan
Fu, Chi-Wing
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (12) : 14385 - 14403
[10] Towards High-Fidelity Text-Guided 3D Face Generation and Manipulation Using only Images
Yu, Cuican
Lu, Guansong
Zeng, Yihan
Sun, Jian
Liang, Xiaodan
Li, Huibin
Xu, Zongben
Xu, Songcen
Zhang, Wei
Xu, Hang
2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 15280 - 15291
