Text2Tex: Text-driven Texture Synthesis via Diffusion Models

Cited by: 14
Authors
Chen, Dave Zhenyu [1]
Siddiqui, Yawar [1]
Lee, Hsin-Ying [2]
Tulyakov, Sergey [2]
Niessner, Matthias [1]
Affiliations
[1] Tech Univ Munich, Munich, Germany
[2] Snap Res, Santa Monica, CA 90405 USA
DOI
10.1109/ICCV51070.2023.01701
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present Text2Tex, a novel method for generating high-quality textures for 3D meshes from given text prompts. Our method incorporates inpainting into a pre-trained depth-aware image diffusion model to progressively synthesize high-resolution partial textures from multiple viewpoints. To avoid accumulating inconsistent and stretched artifacts across views, we dynamically segment the rendered view into a generation mask, which represents the generation status of each visible texel. This partitioned view representation guides the depth-aware inpainting model to generate and update partial textures for the corresponding regions. Furthermore, we propose an automatic view sequence generation scheme to determine the next best view for updating the partial texture. Extensive experiments demonstrate that our method significantly outperforms existing text-driven approaches and GAN-based methods.
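The two ideas named in the abstract, a per-texel generation mask and a next-best-view choice, can be illustrated with a toy sketch. This is not the authors' implementation: the state names (`NEW`, `UPDATE`, `KEEP`), the cosine-based quality measure, the `good=0.8` threshold, and the greedy scoring rule are all assumptions made for illustration, and the diffusion inpainting step itself is omitted.

```python
from enum import Enum


class TexelState(Enum):
    # Hypothetical status labels; the abstract only says the mask
    # "represents the generation status of each visible texel".
    NEW = "new"        # not textured yet
    UPDATE = "update"  # textured before, but only from a poor angle
    KEEP = "keep"      # textured well enough; leave untouched


def generation_mask(view_cos, best_cos, good=0.8):
    """Label every texel visible from one view.

    view_cos: texel id -> cosine between the view direction and the
              surface normal for this view (keys are the visible texels).
    best_cos: texel id -> best cosine under which the texel has been
              textured so far (missing if never textured).
    good:     assumed quality threshold above which a texel is kept.
    """
    mask = {}
    for texel, cos in view_cos.items():
        if texel not in best_cos:
            mask[texel] = TexelState.NEW
        elif best_cos[texel] < good and cos > best_cos[texel]:
            mask[texel] = TexelState.UPDATE
        else:
            mask[texel] = TexelState.KEEP
    return mask


def next_best_view(candidates, best_cos):
    """Greedy choice: the view with the most NEW or UPDATE texels."""
    def score(name):
        mask = generation_mask(candidates[name], best_cos)
        return sum(state is not TexelState.KEEP for state in mask.values())
    return max(candidates, key=score)


def apply_view(view_cos, best_cos):
    """Record that the texels of a view were textured (inpainting omitted)."""
    for texel, cos in view_cos.items():
        best_cos[texel] = max(cos, best_cos.get(texel, 0.0))


# Toy data: two candidate views over four texels.
candidates = {
    "front": {0: 0.9, 1: 0.9, 2: 0.3},
    "side": {2: 0.95, 3: 0.9},
}
best_cos = {}
first = next_best_view(candidates, best_cos)
apply_view(candidates[first], best_cos)
second = next_best_view(candidates, best_cos)
print(first, second)
```

In this toy run the view covering the most untextured texels is chosen first; the second choice then favors the view that can both fill a remaining texel and re-texture one that was previously seen only at a grazing angle.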
Pages: 18512-18522
Page count: 11
Related papers
  • [1] CLIPTexture: Text-driven Texture Synthesis
    Song, Yiren
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022, : 5468 - 5476
  • [2] TexFit: Text-Driven Fashion Image Editing with Diffusion Models
    Wang, Tongxin
    Ye, Mang
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 9, 2024, : 10198 - 10206
  • [3] PaintDiffusion: Towards text-driven painting variation via collaborative diffusion guidance
    Chen, Haibo
    Chen, Zikun
    Zhao, Lei
    Li, Jun
    Yang, Jian
    NEUROCOMPUTING, 2025, 620
  • [4] Text-Driven Chinese Sign Language Synthesis
    徐琳
    高文
    晏洁
    Journal of Harbin Institute of Technology, 1998, (03) : 93 - 98
  • [5] A text-driven sign language synthesis system
    Gao, W
    Xu, L
    Yin, BC
    Liu, Y
    Song, YB
    Yan, J
    Zhou, J
    Chen, HT
    FIFTH INTERNATIONAL CONFERENCE ON COMPUTER-AIDED DESIGN & COMPUTER GRAPHICS, VOLS 1 AND 2, 1997, : 6 - 11
  • [6] Blended Diffusion for Text-driven Editing of Natural Images
    Avrahami, Omri
    Lischinski, Dani
    Fried, Ohad
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 18187 - 18197
  • [7] Prompt Tuning Inversion for Text-Driven Image Editing Using Diffusion Models
    Dong, Wenkai
    Xue, Song
    Duan, Xiaoyue
    Han, Shumin
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION, ICCV, 2023, : 7396 - 7406
  • [8] Text-driven human image generation with texture and pose control
    Jin, Zhedong
    Xia, Guiyu
    Yang, Paike
    Wang, Mengxiang
    Sun, Yubao
    Liu, Qingshan
    NEUROCOMPUTING, 2025, 634
  • [9] Text2Performer: Text-Driven Human Video Generation
    Jiang, Yuming
    Yang, Shuai
    Koh, Tong Liang
    Wu, Wayne
    Loy, Chen Change
    Liu, Ziwei
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 22690 - 22700
  • [10] Text-Driven Video Prediction
    Song, Xue
    Chen, Jingjing
    Zhu, Bin
    Jiang, Yu-gang
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2024, 20 (09)