Text2Tex: Text-driven Texture Synthesis via Diffusion Models

Cited by: 14
Authors
Chen, Dave Zhenyu [1]
Siddiqui, Yawar [1]
Lee, Hsin-Ying [2]
Tulyakov, Sergey [2]
Niessner, Matthias [1]
Affiliations
[1] Tech Univ Munich, Munich, Germany
[2] Snap Res, Santa Monica, CA 90405 USA
Keywords
DOI
10.1109/ICCV51070.2023.01701
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present Text2Tex, a novel method for generating high-quality textures for 3D meshes from given text prompts. Our method incorporates inpainting into a pre-trained depth-aware image diffusion model to progressively synthesize high-resolution partial textures from multiple viewpoints. To avoid accumulating inconsistent and stretched artifacts across views, we dynamically segment the rendered view into a generation mask, which represents the generation status of each visible texel. This partitioned view representation guides the depth-aware inpainting model to generate and update partial textures for the corresponding regions. Furthermore, we propose an automatic view sequence generation scheme to determine the next best view for updating the partial texture. Extensive experiments demonstrate that our method significantly outperforms existing text-driven approaches and GAN-based methods.
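The generation mask and next-best-view idea in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the four status labels, the per-pixel inputs (`visible`, `painted`, `view_score`, `best_score`), the `margin` threshold, and the rule of picking the view with the most pixels left to generate or update.

```python
# Hypothetical status labels for each pixel of a rendered view.
NEW, UPDATE, KEEP, IGNORE = "new", "update", "keep", "ignore"


def pixel_status(visible, painted, view_score, best_score, margin=0.1):
    """Classify one pixel (assumed rule, not the authors' exact criterion).

    visible:    the pixel shows the mesh in this view
    painted:    the texel behind it already has a color
    view_score: how well the current view sees the surface (higher is better)
    best_score: best score any earlier view achieved for that texel
    """
    if not visible:
        return IGNORE
    if not painted:
        return NEW
    if view_score > best_score + margin:
        return UPDATE  # the current view sees this texel clearly better
    return KEEP


def generation_mask(pixels, margin=0.1):
    """Map a 2D grid of (visible, painted, view_score, best_score) to labels."""
    return [[pixel_status(*p, margin=margin) for p in row] for row in pixels]


def next_best_view(masks):
    """Return the index of the candidate view with the most NEW/UPDATE pixels."""
    def work(mask):
        return sum(status in (NEW, UPDATE) for row in mask for status in row)
    return max(range(len(masks)), key=lambda i: work(masks[i]))


view_a = generation_mask([
    [(True, False, 0.9, 0.0), (True, True, 0.9, 0.5)],
    [(True, True, 0.2, 0.6), (False, False, 0.5, 0.0)],
])
view_b = [[KEEP, KEEP], [KEEP, IGNORE]]
chosen = next_best_view([view_b, view_a])
```

In this toy setup an inpainting model would be asked to fill the `new` pixels, refine the `update` pixels, and leave `keep` pixels untouched; the actual partitioning and view-selection criteria used by Text2Tex may differ.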
Pages: 18512-18522
Page count: 11