Portrait3D: Text-Guided High-Quality 3D Portrait Generation Using Pyramid Representation and GANs Prior

Cited by: 1
Authors
Wu, Yiqian [1 ]
Xu, Hao [1 ]
Tang, Xiangjun [1 ]
Chen, Xien [2 ]
Tang, Siyu [3 ]
Zhang, Zhebin [4 ]
Li, Chen [4 ]
Jin, Xiaogang [1 ]
Affiliations
[1] Zhejiang Univ, State Key Lab CAD&CG, Hangzhou, Peoples R China
[2] Yale Univ, New Haven, CT USA
[3] Swiss Fed Inst Technol, Zurich, Switzerland
[4] OPPO US Res Ctr, Menlo Pk, CA USA
Source
ACM TRANSACTIONS ON GRAPHICS | 2024 / Vol. 43 / No. 4
Funding
National Natural Science Foundation of China;
Keywords
3D portrait generation; 3D-aware GANs; diffusion models;
DOI
10.1145/3658162
CLC number
TP31 [Computer Software];
Discipline codes
081202 ; 0835 ;
Abstract
Existing neural-rendering-based text-to-3D-portrait generation methods typically rely on human geometry priors and diffusion models for guidance. However, relying solely on geometry information introduces issues such as the Janus problem, over-saturation, and over-smoothing. We present Portrait3D, a novel neural-rendering-based framework with a joint geometry-appearance prior that achieves text-to-3D-portrait generation free of the aforementioned issues. To accomplish this, we train a 3D portrait generator, 3DPortraitGAN, as a robust prior. This generator can produce 360° canonical 3D portraits, serving as a starting point for the subsequent diffusion-based generation process. To mitigate the "grid-like" artifacts caused by high-frequency information in the feature-map-based 3D representations commonly used by 3D-aware GANs, we integrate a novel pyramid tri-grid 3D representation into 3DPortraitGAN. To generate a 3D portrait from text, we first project a randomly generated image aligned with the given prompt into the pre-trained 3DPortraitGAN's latent space. The resulting latent code is then used to synthesize a pyramid tri-grid. Starting from this pyramid tri-grid, we use score distillation sampling to distill the diffusion model's knowledge into it. We then use the diffusion model to refine rendered images of the 3D portrait and use these refined images as training data to further optimize the pyramid tri-grid, effectively eliminating unrealistic colors and unnatural artifacts. Our experimental results show that Portrait3D produces realistic, high-quality, canonical 3D portraits that align with the prompt.
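The score distillation sampling (SDS) step described in the abstract can be sketched in a few lines. This is a minimal, self-contained illustration under simplifying assumptions, not the paper's implementation: `render` and `denoise` are hypothetical stand-ins for the differentiable pyramid-tri-grid renderer and the pretrained diffusion model's noise predictor, and the noise schedule is a toy one.

```python
import numpy as np

def sds_gradient(render, theta, denoise, t, rng):
    """Toy score distillation sampling gradient for renderer parameters theta.

    `render` and `denoise` are hypothetical stand-ins for the differentiable
    3D renderer and the pretrained diffusion model's noise predictor.
    """
    x = render(theta)                       # rendered image (flat array)
    alpha = 1.0 - t                         # toy noise schedule, t in (0, 1)
    eps = rng.standard_normal(x.shape)      # injected Gaussian noise
    x_t = np.sqrt(alpha) * x + np.sqrt(1.0 - alpha) * eps  # noised render
    eps_hat = denoise(x_t, t)               # diffusion model's noise estimate
    w = 1.0 - alpha                         # timestep weighting w(t)
    # SDS uses w(t) * (eps_hat - eps) as the gradient w.r.t. the rendered
    # image, skipping the diffusion model's Jacobian.
    return w * (eps_hat - eps)

# Minimal demo: identity renderer, and a denoiser whose implied data
# distribution is concentrated at the zero image, so SDS pulls theta there.
rng = np.random.default_rng(0)
theta = np.ones(4)
for _ in range(50):
    g = sds_gradient(lambda th: th, theta,
                     lambda x_t, t: x_t / np.sqrt(t), 0.5, rng)
    theta = theta - 0.1 * g
```

In the full method, the same loop would backpropagate this gradient through the renderer into the pyramid tri-grid, and the subsequent image-refinement stage would replace pure SDS updates.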
Pages: 12