Vox-E: Text-guided Voxel Editing of 3D Objects

Cited by: 19
Authors
Sella, Etai [1]
Fiebelman, Gal [1]
Hedman, Peter [2]
Averbuch-Elor, Hadar [1]
Affiliations
[1] Tel Aviv Univ, Tel Aviv, Israel
[2] Google Res, New York, NY 10011 USA
DOI
10.1109/ICCV51070.2023.00046
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Large-scale text-guided diffusion models have garnered significant attention due to their ability to synthesize diverse images that convey complex visual concepts. This generative power has more recently been leveraged to perform text-to-3D synthesis. In this work, we present a technique that harnesses the power of latent diffusion models for editing existing 3D objects. Our method takes oriented 2D images of a 3D object as input and learns a grid-based volumetric representation of it. To guide the volumetric representation to conform to a target text prompt, we follow unconditional text-to-3D methods and optimize a Score Distillation Sampling (SDS) loss. However, we observe that combining this diffusion-guided loss with an image-based regularization loss that encourages the representation not to deviate too strongly from the input object is challenging, as it requires achieving two conflicting goals while viewing only structure-and-appearance coupled 2D projections. Thus, we introduce a novel volumetric regularization loss that operates directly in 3D space, utilizing the explicit nature of our 3D representation to enforce correlation between the global structure of the original and edited object. Furthermore, we present a technique that optimizes cross-attention volumetric grids to refine the spatial extent of the edits. Extensive experiments and comparisons demonstrate the effectiveness of our approach in creating a myriad of edits that cannot be achieved by prior works.
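The abstract states only that the volumetric regularization loss enforces correlation between the global structure of the original and edited object; it does not give a formula. As a rough illustration, the sketch below assumes one plausible form: one minus the Pearson correlation between the two density grids, flattened to lists. The function name `correlation_regularizer`, the `eps` stabilizer, and the toy density values are all hypothetical and are not taken from this record.

```python
import math
from typing import Sequence


def correlation_regularizer(orig_density: Sequence[float],
                            edit_density: Sequence[float],
                            eps: float = 1e-8) -> float:
    """Return 1 - Pearson correlation of two flattened density grids.

    Assumed form only: a value near 0 means the edited densities vary in
    step with the original ones; larger values mean the structure drifted.
    """
    if len(orig_density) != len(edit_density):
        raise ValueError("grids must have the same number of voxels")
    n = len(orig_density)
    mean_o = sum(orig_density) / n
    mean_e = sum(edit_density) / n
    # Covariance and variances over all voxels.
    cov = sum((o - mean_o) * (e - mean_e)
              for o, e in zip(orig_density, edit_density)) / n
    var_o = sum((o - mean_o) ** 2 for o in orig_density) / n
    var_e = sum((e - mean_e) ** 2 for e in edit_density) / n
    # eps guards against division by zero for constant grids.
    return 1.0 - cov / (math.sqrt(var_o * var_e) + eps)


# Toy 2x2x2 grid flattened to 8 voxels (made-up values).
original = [0.0, 0.1, 0.9, 1.0, 0.0, 0.2, 0.8, 1.0]
scaled = [2.0 * d for d in original]      # same structure, denser
inverted = [1.0 - d for d in original]    # occupied and empty swapped

loss_same = correlation_regularizer(original, scaled)
loss_flip = correlation_regularizer(original, inverted)
```

Under this assumed form, a uniformly denser copy of the original grid yields a loss close to 0, while a grid with occupied and empty regions swapped yields a loss close to 2. Because the term compares 3D densities directly, it is independent of appearance and of any particular 2D view, which matches the motivation given in the abstract.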
Pages: 430-440
Number of pages: 11