ControlNeRF: Text-Driven 3D Scene Stylization via Diffusion Model

Cited by: 0
Authors
Chen, Jiahui [1 ]
Yang, Chuanfeng [1 ]
Li, Kaiheng [1 ]
Wu, Qingqiang [1 ]
Hong, Qingqi [1 ]
Affiliations
[1] Xiamen Univ, Dept Digital Media Technol, Xiamen, Peoples R China
Keywords
Stylization; Neural Radiance Fields; Diffusion Model; View Synthesis;
DOI
10.1007/978-3-031-72335-3_27
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
3D scene stylization aims to generate artistically rendered images from arbitrary viewpoints within a 3D scene while keeping the style consistent regardless of the viewing angle. Traditional 2D stylization methods struggle to maintain this consistency when applied to 3D environments. To address this issue, we propose ControlNeRF, which employs a customized conditional diffusion model, ControlNet, and introduces latent variables, so that a stylized appearance is obtained throughout the scene driven solely by text. This text-driven approach avoids the inconvenience of using images as style cues, achieves a high degree of stylistic consistency across viewpoints, and produces high-quality images. Rigorous testing of ControlNeRF on diverse styles confirms these outcomes. Our approach advances 3D scene stylization and opens new possibilities for artistic expression and digital imaging.
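The abstract describes stylizing a radiance field with a text-conditioned ControlNet. The paper's actual pipeline (latent variables and radiance-field optimization) is not given here, so the following is only a minimal single-view sketch built on the Hugging Face diffusers ControlNet img2img pipeline; the model IDs, file names, prompt, and strength value are illustrative assumptions, and per-view stylization like this does not by itself provide the multi-view consistency that ControlNeRF targets.

```python
# Minimal sketch (not the paper's method): text-driven stylization of one
# rendered NeRF view with a Canny-conditioned ControlNet via diffusers.
import cv2
import numpy as np
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetImg2ImgPipeline

# Pretrained ControlNet (Canny-edge conditioning) plus a Stable Diffusion backbone.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

# A single view rendered by the NeRF (hypothetical file name).
rendered = Image.open("nerf_render_view000.png").convert("RGB")

# Edge map of the render preserves scene structure while the prompt sets the style.
gray = cv2.cvtColor(np.array(rendered), cv2.COLOR_RGB2GRAY)
edges = cv2.Canny(gray, 100, 200)
control = Image.fromarray(np.stack([edges] * 3, axis=-1))

styled = pipe(
    prompt="an oil painting in the style of Van Gogh",  # text is the only style cue
    image=rendered,              # init image for img2img
    control_image=control,       # structural conditioning
    strength=0.6,                # how far to deviate from the original render
    guidance_scale=7.5,
    num_inference_steps=30,
).images[0]
styled.save("stylized_view000.png")
```

In the full method, stylized outputs like this would feed back into optimizing the radiance field so that all viewpoints share one consistent appearance; the sketch above only shows the text-driven ControlNet step.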
Pages: 395 - 406
Page count: 12
Related Papers
50 records in total
  • [21] An Emotional Text-Driven 3D Visual Pronunciation System for Mandarin Chinese
    Yu, Lingyun
    Luo, Changwei
    Yu, Jun
    PATTERN RECOGNITION (CCPR 2016), PT I, 2016, 662 : 93 - 104
  • [22] InterFusion: Text-Driven Generation of 3D Human-Object Interaction
    Dai, Sisi
    Li, Wenhao
    Sun, Haowen
    Huang, Haibin
    Ma, Chongyang
    Huang, Hui
    Xu, Kai
    Hu, Ruizhen
    COMPUTER VISION - ECCV 2024, PT XLVIII, 2025, 15106 : 18 - 35
  • [23] AvatarStudio: Text-driven Editing of 3D Dynamic Human Head Avatars
    Mendiratta, Mohit
    Pan, Xingang
    Elgharib, Mohamed
    Teotia, Kartik
    Mallikarjun, B. R.
    Tewari, Ayush
    Golyanik, Vladislav
    Kortylewski, Adam
    Theobalt, Christian
    ACM TRANSACTIONS ON GRAPHICS, 2023, 42 (06):
  • [24] Text2Tex: Text-driven Texture Synthesis via Diffusion Models
    Chen, Dave Zhenyu
    Siddiqui, Yawar
    Lee, Hsin-Ying
    Tulyakov, Sergey
    Niessner, Matthias
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 18512 - 18522
  • [25] Text-driven clothed human image synthesis with 3D human model estimation for assistance in shopping
    Karkuzhali, S.
    Syed Aasim, A.
    StalinRaj, A.
    MULTIMEDIA TOOLS AND APPLICATIONS, 2025, 84 (1) : 167 - 200
  • [26] CLIP-Actor: Text-Driven Recommendation and Stylization for Animating Human Meshes
    Youwang, Kim
    Ji-Yeon, Kim
    Oh, Tae-Hyun
    COMPUTER VISION - ECCV 2022, PT III, 2022, 13663 : 173 - 191
  • [27] PaintDiffusion: Towards text-driven painting variation via collaborative diffusion guidance
    Chen, Haibo
    Chen, Zikun
    Zhao, Lei
    Li, Jun
    Yang, Jian
    NEUROCOMPUTING, 2025, 620
  • [28] AvatarCLIP: Zero-Shot Text-Driven Generation and Animation of 3D Avatars
    Hong, Fangzhou
    Zhang, Mingyuan
    Pan, Liang
    Cai, Zhongang
    Yang, Lei
    Liu, Ziwei
    ACM TRANSACTIONS ON GRAPHICS, 2022, 41 (04):
  • [29] HyperStyle3D: Text-Guided 3D Portrait Stylization via Hypernetworks
    Chen, Zhuo
    Xu, Xudong
    Yan, Yichao
    Pan, Ye
    Zhu, Wenhan
    Wu, Wayne
    Dai, Bo
    Yang, Xiaokang
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (10) : 9997 - 10010
  • [30] Scene-Conditional 3D Object Stylization and Composition
    Zhou, Jinghao
    Jakab, Tomas
    Torr, Philip
    Rupprecht, Christian
    COMPUTER VISION - ECCV 2024, PT XXXVII, 2025, 15095 : 289 - 305