3DStyle-Diffusion: Pursuing Fine-grained Text-driven 3D Stylization with 2D Diffusion Models

Cited by: 5
Authors
Yang, Haibo [1]
Chen, Yang [2]
Pan, Yingwei [2]
Yao, Ting [3]
Chen, Zhineng [1]
Mei, Tao [3]
Affiliations
[1] Fudan Univ, Sch Comp Sci, Shanghai, Peoples R China
[2] Univ Sci & Technol China, Hefei, Peoples R China
[3] HiDream Ai Inc, Beijing, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China;
Keywords
Text-driven 3D Stylization; Diffusion Model; Depth;
DOI
10.1145/3581783.3612363
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Subject classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
3D content creation via text-driven stylization has posed a fundamental challenge to the multimedia and graphics community. Recent advances in cross-modal foundation models (e.g., CLIP) have made this problem feasible. Existing approaches commonly leverage CLIP to align the holistic semantics of the stylized mesh with the given text prompt. Nevertheless, it is not trivial to enable more controllable stylization of fine-grained details in 3D meshes based solely on such semantic-level cross-modal supervision. In this work, we propose a new 3DStyle-Diffusion model that triggers fine-grained stylization of 3D meshes with additional controllable appearance and geometric guidance from 2D Diffusion models. Technically, 3DStyle-Diffusion first parameterizes the texture of the 3D mesh into reflectance properties and scene lighting using implicit MLP networks. Meanwhile, an accurate depth map of each sampled view is obtained conditioned on the 3D mesh. Then, 3DStyle-Diffusion leverages a pretrained controllable 2D Diffusion model to guide the learning of rendered images, encouraging the synthesized image of each view to be semantically aligned with the text prompt and geometrically consistent with the depth map. This design integrates both image rendering via implicit MLP networks and the diffusion process of image synthesis in an end-to-end fashion, enabling high-quality fine-grained stylization of 3D meshes. We also build a new dataset derived from Objaverse and an evaluation protocol for this task. Through both qualitative and quantitative experiments, we validate the capability of our 3DStyle-Diffusion. Source code and data are available at https://github.com/yanghb22-fdu/3DStyle-Diffusion-Official.
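Illustrative note: the abstract says the mesh texture is parameterized into reflectance properties by implicit MLP networks. The sketch below is only a rough picture of what such a mapping could look like, not the authors' implementation. The positional encoding, the layer sizes, the random untrained weights, and the split of the output into `diffuse`, `roughness`, and `specular` channels are all assumptions made for illustration. Scene lighting and the diffusion guidance are not modeled here.

```python
import math
import random


def positional_encoding(point, num_freqs=4):
    """Encode one 3D point as raw coordinates plus sin/cos frequency bands."""
    feats = list(point)
    for k in range(num_freqs):
        for x in point:
            feats.append(math.sin((2 ** k) * math.pi * x))
            feats.append(math.cos((2 ** k) * math.pi * x))
    return feats  # length is 3 + 3 * 2 * num_freqs


def init_layer(in_dim, out_dim, rng):
    """Create one dense layer with small random weights and zero bias."""
    scale = 1.0 / math.sqrt(in_dim)
    weights = [[rng.gauss(0.0, scale) for _ in range(in_dim)]
               for _ in range(out_dim)]
    return weights, [0.0] * out_dim


def linear(layer, x):
    """Apply a dense layer (weights, bias) to a feature vector."""
    weights, bias = layer
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def reflectance_mlp(layers, point):
    """Map a surface point to hypothetical reflectance channels in (0, 1)."""
    h = positional_encoding(point)
    h = [max(0.0, v) for v in linear(layers[0], h)]  # ReLU hidden layer
    out = [1.0 / (1.0 + math.exp(-v)) for v in linear(layers[1], h)]  # sigmoid
    # Assumed channel layout: 3 diffuse, 1 roughness, 3 specular.
    return {"diffuse": out[0:3], "roughness": out[3], "specular": out[4:7]}


rng = random.Random(0)
layers = [init_layer(27, 32, rng), init_layer(32, 7, rng)]
materials = reflectance_mlp(layers, (0.1, -0.4, 0.7))
```

In a pipeline like the one the abstract describes, the weights of such a network would be the quantities optimized under the diffusion model's guidance; here they are left at their random initial values.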
Pages: 6860-6868
Number of pages: 9