IRISformer: Dense Vision Transformers for Single-Image Inverse Rendering in Indoor Scenes

Cited by: 32
Authors
Zhu, Rui [1 ]
Li, Zhengqin [1 ]
Matai, Janarbek [2 ]
Porikli, Fatih [2 ]
Chandraker, Manmohan [1 ]
Affiliations
[1] Univ Calif San Diego, La Jolla, CA 92093 USA
[2] Qualcomm AI Res, San Diego, CA USA
Funding
U.S. National Science Foundation
DOI
10.1109/CVPR52688.2022.00284
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Indoor scenes exhibit significant appearance variations due to myriad interactions between arbitrarily diverse object shapes, spatially-changing materials, and complex lighting. Shadows, highlights, and inter-reflections caused by visible and invisible light sources require reasoning about long-range interactions for inverse rendering, which seeks to recover the components of image formation, namely, shape, material, and lighting. In this work, our intuition is that the long-range attention learned by transformer architectures is ideally suited to solve longstanding challenges in single-image inverse rendering. We demonstrate this with a specific instantiation of a dense vision transformer, IRISformer, that excels at both the single-task and multi-task reasoning required for inverse rendering. Specifically, we propose a transformer architecture to simultaneously estimate depths, normals, spatially-varying albedo, roughness and lighting from a single image of an indoor scene. Our extensive evaluations on benchmark datasets demonstrate state-of-the-art results on each of the above tasks, enabling applications like object insertion and material editing in a single unconstrained real image, with greater photorealism than prior works. Code and data are publicly released.
Pages: 2812-2821
Number of pages: 10
Related papers
45 in total
  • [1] Inverse Rendering for Complex Indoor Scenes: Shape, Spatially-Varying Lighting and SVBRDF from a Single Image
    Li, Zhengqin
    Shafiei, Mohammad
    Ramamoorthi, Ravi
    Sunkavalli, Kalyan
    Chandraker, Manmohan
    [J]. 2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020, : 2472 - 2481
  • [2] Neural Inverse Rendering of an Indoor Scene From a Single Image
    Sengupta, Soumyadip
    Gu, Jinwei
    Kim, Kihwan
    Liu, Guilin
    Jacobs, David W.
    Kautz, Jan
    [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 8597 - 8606
  • [3] Automatic single-image 3D reconstructions of indoor Manhattan world scenes
    Delage, Erick
    Lee, Honglak
    Ng, Andrew Y.
    [J]. ROBOTICS RESEARCH, 2007, 28 : 305 - +
  • [4] Inverse rendering from a single image
    Boivin, S
    [J]. CGIV'2002: FIRST EUROPEAN CONFERENCE ON COLOUR IN GRAPHICS, IMAGING, AND VISION, CONFERENCE PROCEEDINGS, 2002, : 268 - 277
  • [5] Vision Transformers for Single Image Dehazing
    Song, Yuda
    He, Zhuqing
    Qian, Hui
    Du, Xin
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2023, 32 : 1927 - 1941
  • [6] Stacked dense networks for single-image snow removal
    Li, Pengyue
    Yun, Mengshen
    Tian, Jiandong
    Tang, Yandong
    Wang, Guolin
    Wu, Chengdong A.
    [J]. NEUROCOMPUTING, 2019, 367 : 152 - 163
  • [7] Recursive modified dense network for single-image deraining
    Chai, Guoqiang
    Wang, Zhaoba
    Guo, Guodong
    Chen, Youxing
    Jin, Yong
    Wang, Wei
    Zhao, Xia
    [J]. JOURNAL OF ELECTRONIC IMAGING, 2020, 29 (03)
  • [8] InverseRenderNet: Learning single image inverse rendering
    Yu, Ye
    Smith, William A. P.
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 3150 - 3159
  • [9] Single-Image SVBRDF Capture with a Rendering-Aware Deep Network
    Deschaintre, Valentin
    Aittala, Miika
    Durand, Fredo
    Drettakis, George
    Bousseau, Adrien
    [J]. ACM TRANSACTIONS ON GRAPHICS, 2018, 37 (04)
  • [10] HMD-Guided Image-Based Modeling and Rendering of Indoor Scenes
    Andersen, Daniel
    Popescu, Voicu
    [J]. VIRTUAL REALITY AND AUGMENTED REALITY, EUROVR 2018, 2018, 11162 : 73 - 93