Text2Tex: Text-driven Texture Synthesis via Diffusion Models

Cited by: 14

Authors:

Chen, Dave Zhenyu [1]

Siddiqui, Yawar [1]

Lee, Hsin-Ying [2]

Tulyakov, Sergey [2]

Niessner, Matthias [1]

Affiliations:

[1] Tech Univ Munich, Munich, Germany

[2] Snap Res, Santa Monica, CA 90405 USA

Source:

2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023) | 2023

DOI:

10.1109/ICCV51070.2023.01701

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

We present Text2Tex, a novel method for generating high-quality textures for 3D meshes from given text prompts. Our method incorporates inpainting into a pre-trained depth-aware image diffusion model to progressively synthesize high-resolution partial textures from multiple viewpoints. To avoid accumulating inconsistent and stretched artifacts across views, we dynamically segment the rendered view into a generation mask, which represents the generation status of each visible texel. This partitioned view representation guides the depth-aware inpainting model to generate and update partial textures for the corresponding regions. Furthermore, we propose an automatic view sequence generation scheme to determine the next best view for updating the partial texture. Extensive experiments demonstrate that our method significantly outperforms existing text-driven approaches and GAN-based methods.
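
The abstract mentions a per-view generation mask and a next-best-view choice but does not give the exact rules. The following is only a minimal sketch of how such a partition might look, not the authors' implementation. The four state names, the cosine-based "seen more frontally" test, the `margin` threshold, and both function names are assumptions introduced here for illustration.

```python
import numpy as np

# Hypothetical state codes for each pixel of a rendered view (assumed, not from the paper text above).
IGNORE, NEW, UPDATE, KEEP = 0, 1, 2, 3


def generation_mask(visible, painted, cur_cos, best_cos, margin=0.1):
    """Partition a rendered view into generation states.

    visible  : bool array, pixel shows the mesh in the current view
    painted  : bool array, the texel behind the pixel already has texture
    cur_cos  : float array, cosine between view direction and surface normal now
    best_cos : float array, best such cosine among earlier views
    margin   : assumed threshold for "clearly more frontal than before"
    """
    visible = np.asarray(visible, dtype=bool)
    painted = np.asarray(painted, dtype=bool)
    cur_cos = np.asarray(cur_cos, dtype=float)
    best_cos = np.asarray(best_cos, dtype=float)

    # A painted texel is refreshed only if this view sees it clearly more frontally.
    better = cur_cos > best_cos + margin

    mask = np.full(visible.shape, IGNORE, dtype=np.uint8)
    mask[visible & ~painted] = NEW
    mask[visible & painted & better] = UPDATE
    mask[visible & painted & ~better] = KEEP
    return mask


def pick_next_view(masks):
    """Return the index of the candidate view with the most NEW or UPDATE pixels."""
    scores = [int(np.isin(m, (NEW, UPDATE)).sum()) for m in masks]
    return int(np.argmax(scores))


if __name__ == "__main__":
    visible = [[True, True], [True, False]]
    painted = [[False, True], [True, True]]
    cur_cos = [[0.9, 0.9], [0.3, 0.9]]
    best_cos = [[0.0, 0.2], [0.8, 0.1]]
    print(generation_mask(visible, painted, cur_cos, best_cos))
```

In this sketch, NEW and UPDATE pixels would be handed to the inpainting model while KEEP pixels are left as they are; scoring candidate views by how many such pixels they expose is one simple, assumed way to rank the next view.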

Pages: 18512-18522

Page count: 11
