Depth-aware guidance with self-estimated depth representations of diffusion models

Cited: 1
Authors
Kim, Gyeongnyeon [1]
Jang, Wooseok [1]
Lee, Gyuseong [1]
Hong, Susung [1]
Seo, Junyoung [1]
Kim, Seungryong [1]
Affiliations
[1] Korea Univ, Dept Comp Sci, Seoul, South Korea
Keywords
Diffusion models; Depth estimation; Diffusion guidance;
DOI
10.1016/j.patcog.2024.110474
CLC Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Diffusion models have recently shown significant advances in generative modeling, with impressive fidelity and diversity. The success of these models can often be attributed to their use of sampling guidance techniques, such as classifier or classifier-free guidance, which provide effective mechanisms to trade off fidelity against diversity. However, these methods cannot guide a generated image to be aware of its geometric configuration, e.g., depth, which hinders their application to downstream tasks that require a certain level of depth awareness, such as scene understanding. To overcome this limitation, we propose a novel sampling guidance method for diffusion models that uses self-predicted depth information derived from the rich intermediate representations of diffusion models. Concretely, we first present a label-efficient depth estimation framework using internal representations of diffusion models. Subsequently, we propose incorporating two guidance techniques during the sampling phase: pseudo-labeling and a depth-domain diffusion prior, which self-condition the generated image on the estimated depth map. Experiments and comprehensive ablation studies demonstrate the effectiveness of our method in guiding diffusion models toward the generation of geometrically plausible images. Our project page is available at https://ku-cvlab.github.io/DAG/.
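The guidance mechanism the abstract describes can be sketched in classifier-guidance style: a depth head reads the model's internal features, and the gradient of a depth-consistency loss against a pseudo-label depth map nudges the predicted noise at each sampling step. The sketch below is a minimal toy illustration, not the paper's implementation: the per-pixel linear "depth head", the shapes, and all names (`predict_depth`, `depth_guided_eps`, `scale`) are illustrative stand-ins for the head trained on the diffusion U-Net's representations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-pixel linear "depth head" over 3 feature channels,
# standing in for the head the paper trains on U-Net internal features.
w = rng.standard_normal(3)  # one weight per channel

def predict_depth(x):
    """Self-estimated depth map: per-pixel linear readout of features x (C,H,W)."""
    return np.einsum('c,chw->hw', w, x)

def depth_guided_eps(x_t, eps_pred, pseudo_depth, scale=1.0):
    """Classifier-guidance-style update: add the gradient of a
    depth-consistency loss L = mean((d_hat - pseudo_depth)^2) to the
    predicted noise, steering sampling toward the pseudo-label depth."""
    d_hat = predict_depth(x_t)
    resid = d_hat - pseudo_depth                       # (H, W) residual
    # Analytic gradient of L w.r.t. x_t for the linear head above:
    # dL/dx[c,h,w] = (2 / (H*W)) * resid[h,w] * w[c]
    grad = 2.0 / resid.size * np.einsum('c,hw->chw', w, resid)
    return eps_pred + scale * grad

# Toy usage: one guided update on random tensors with a flat pseudo-label depth.
x_t = rng.standard_normal((3, 16, 16))   # noisy sample at step t
eps = rng.standard_normal((3, 16, 16))   # unguided noise prediction
target = np.zeros((16, 16))              # toy pseudo-label depth map
eps_guided = depth_guided_eps(x_t, eps, target, scale=5.0)
```

In a real sampler this adjusted noise estimate would replace the unguided one inside the denoising loop; the paper's second mechanism (the depth-domain diffusion prior) would additionally refine the depth map itself before it is used as the target.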
Pages: 13
Related Papers
50 records total
  • [1] Learning Depth-Aware Deep Representations for Robotic Perception
    Porzi, Lorenzo
    Bulo, Samuel Rota
    Penate-Sanchez, Adrian
    Ricci, Elisa
    Moreno-Noguer, Francesc
    [J]. IEEE ROBOTICS AND AUTOMATION LETTERS, 2017, 2 (02): : 468 - 475
  • [2] Depth-Aware Mirror Segmentation
    Mei, Haiyang
    Dong, Bo
    Dong, Wen
    Peers, Pieter
    Yang, Xin
    Zhang, Qiang
    Wei, Xiaopeng
    [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 3043 - 3052
  • [3] Depth-Aware Motion Magnification
    Kooij, Julian F. P.
    van Gemert, Jan C.
    [J]. COMPUTER VISION - ECCV 2016, PT VIII, 2016, 9912 : 467 - 482
  • [4] Depth-Aware Panoptic Segmentation
    Tuan Nguyen
    Mehltretter, Max
    Rottensteiner, Franz
    [J]. ISPRS ANNALS OF THE PHOTOGRAMMETRY, REMOTE SENSING AND SPATIAL INFORMATION SCIENCES: VOLUME X-2-2024, 2024, : 153 - 161
  • [5] Depth-Aware Shadow Removal
    Fu, Yanping
    Gai, Zhenyu
    Zhao, Haifeng
    Zhang, Shaojie
    Shan, Ying
    Wu, Yang
    Tang, Jin
    [J]. COMPUTER GRAPHICS FORUM, 2022, 41 (07) : 455 - 464
  • [6] Depth-Aware Unpaired Video Dehazing
    Yang, Yang
    Guo, Chun-Le
    Guo, Xiaojie
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2024, 33 : 2388 - 2403
  • [7] Depth-Aware Stereo Video Retargeting
    Li, Bing
    Lin, Chia-Wen
    Shi, Boxin
    Huang, Tiejun
    Gao, Wen
    Kuo, C.-C. Jay
    [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 6517 - 6525
  • [8] Depth-Aware Image Seam Carving
    Shen, Jianbing
    Wang, Dapeng
    Li, Xuelong
    [J]. IEEE TRANSACTIONS ON CYBERNETICS, 2013, 43 (05) : 1453 - 1461
  • [9] Depth-Aware Image Colorization Network
    Chu, Wei-Ta
    Hsu, Yu-Ting
    [J]. PROCEEDINGS OF THE 2018 WORKSHOP ON UNDERSTANDING SUBJECTIVE ATTRIBUTES OF DATA, WITH THE FOCUS ON EVOKED EMOTIONS (EE-USAD'18), 2018, : 17 - 23
  • [10] Depth-Aware Video Frame Interpolation
    Bao, Wenbo
    Lai, Wei-Sheng
    Ma, Chao
    Zhang, Xiaoyun
    Gao, Zhiyong
    Yang, Ming-Hsuan
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 3698 - 3707