Portrait3D: Text-Guided High-Quality 3D Portrait Generation Using Pyramid Representation and GANs Prior

Cited by: 1

Authors:

Wu, Yiqian [1]

Xu, Hao [1]

Tang, Xiangjun [1]

Chen, Xien [2]

Tang, Siyu [3]

Zhang, Zhebin [4]

Li, Chen [4]

Jin, Xiaogang [1]

Affiliations:

[1] Zhejiang Univ, State Key Lab CAD&CG, Hangzhou, Peoples R China

[2] Yale Univ, New Haven, CT USA

[3] Swiss Fed Inst Technol, Zurich, Switzerland

[4] OPPO US Res Ctr, Menlo Pk, CA USA

Source:

ACM TRANSACTIONS ON GRAPHICS | 2024 / Vol. 43 / Issue 04

Funding:

National Natural Science Foundation of China;

Keywords:

3D portrait generation; 3D-aware GANs; diffusion models;

DOI:

10.1145/3658162

Chinese Library Classification (CLC) number:

TP31 [Computer Software];

Discipline classification codes:

081202 ; 0835 ;

Abstract:

Existing neural rendering-based text-to-3D-portrait generation methods typically make use of a human geometry prior and diffusion models to obtain guidance. However, relying solely on geometry information introduces issues such as the Janus problem, over-saturation, and over-smoothing. We present Portrait3D, a novel neural rendering-based framework with a novel joint geometry-appearance prior to achieve text-to-3D-portrait generation that overcomes the aforementioned issues. To accomplish this, we train a 3D portrait generator, 3DPortraitGAN(sic), as a robust prior. This generator is capable of producing 360° canonical 3D portraits, serving as a starting point for the subsequent diffusion-based generation process. To mitigate the "grid-like" artifact caused by the high-frequency information in the feature-map-based 3D representation commonly used by most 3D-aware GANs, we integrate a novel pyramid tri-grid 3D representation into 3DPortraitGAN(sic). To generate 3D portraits from text, we first project a randomly generated image aligned with the given prompt into the pre-trained 3DPortraitGAN(sic)'s latent space. The resulting latent code is then used to synthesize a pyramid tri-grid. Beginning with the obtained pyramid tri-grid, we use score distillation sampling to distill the diffusion model's knowledge into the pyramid tri-grid. Following that, we utilize the diffusion model to refine the rendered images of the 3D portrait and then use these refined images as training data to further optimize the pyramid tri-grid, effectively eliminating issues with unrealistic color and unnatural artifacts. Our experimental results show that Portrait3D can produce realistic, high-quality, and canonical 3D portraits that align with the prompt.

Pages: 12