Neural Scene De-rendering

Cited by: 21
Authors
Wu, Jiajun [1]
Tenenbaum, Joshua B. [1]
Kohli, Pushmeet [2]
Affiliations
[1] MIT CSAIL, Cambridge, MA 02139 USA
[2] Microsoft Res, Bengaluru, India
Keywords
BAYESIAN-INFERENCE; VISION;
DOI
10.1109/CVPR.2017.744
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We study the problem of holistic scene understanding. We would like to obtain a compact, expressive, and interpretable representation of scenes that encodes information such as the number of objects and their categories, poses, and positions. Such a representation would allow us to reason about, and even reconstruct or manipulate, elements of the scene. Previous works have used encoder-decoder neural architectures to learn image representations; however, representations obtained in this way are typically uninterpretable, or explain only a single object in the scene. In this work, we propose a new approach to learning an interpretable distributed representation of scenes. Our approach employs a deterministic rendering function as the decoder, mapping a naturally structured and disentangled scene description, which we call scene XML, to an image. As a result, the encoder is forced to perform the inverse of the rendering operation (a.k.a. de-rendering), transforming an input image into the structured scene XML that the decoder used to produce the image. We use an object-proposal-based encoder that is trained by minimizing both the supervised prediction error and the unsupervised reconstruction error. Experiments demonstrate that our approach works well on scene de-rendering with two different graphics engines, and that the learned representation can be easily adapted to a wide range of applications such as image editing, inpainting, visual analogy-making, and image captioning.
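The training objective described in the abstract, a supervised prediction error plus an unsupervised reconstruction error computed through a deterministic renderer, can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not taken from the paper: the `SceneObject` fields, the 4x4 grid `render` function standing in for a graphics engine, the squared-error losses, and the `weight` parameter are all hypothetical. The sketch only evaluates the two error terms; it leaves out the encoder and how it would be optimized.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SceneObject:
    """Hypothetical stand-in for one entry of a structured scene description."""
    category: int  # toy category id, drawn as the pixel intensity
    x: int
    y: int


Scene = List[SceneObject]
Image = List[List[float]]


def render(scene: Scene, size: int = 4) -> Image:
    """Deterministic toy renderer: paint each object's category id at (x, y)."""
    image = [[0.0] * size for _ in range(size)]
    for obj in scene:
        if 0 <= obj.x < size and 0 <= obj.y < size:
            image[obj.y][obj.x] = float(obj.category)
    return image


def supervised_loss(pred: Scene, target: Scene) -> float:
    """Squared attribute error, assuming objects are already matched by index."""
    total = 0.0
    for p, t in zip(pred, target):
        total += (p.category - t.category) ** 2
        total += (p.x - t.x) ** 2 + (p.y - t.y) ** 2
    return total


def reconstruction_loss(pred: Scene, image: Image) -> float:
    """Mean squared pixel error between the re-rendered prediction and the input."""
    size = len(image)
    rendered = render(pred, size)
    err = sum(
        (rendered[r][c] - image[r][c]) ** 2
        for r in range(size)
        for c in range(size)
    )
    return err / (size * size)


def total_loss(pred: Scene, target: Scene, image: Image, weight: float = 1.0) -> float:
    """Sum of the supervised term and the weighted reconstruction term."""
    return supervised_loss(pred, target) + weight * reconstruction_loss(pred, image)


if __name__ == "__main__":
    target = [SceneObject(1, 0, 0), SceneObject(2, 3, 1)]
    image = render(target)
    shifted = [SceneObject(1, 1, 0), SceneObject(2, 3, 1)]
    print(total_loss(target, target, image))
    print(total_loss(shifted, target, image))
```

In this toy setting a prediction that matches the ground-truth description drives both terms to zero, while a prediction with one misplaced object is penalized both in attribute space and in pixel space.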
Pages: 7035-7043
Page count: 9