Holistic 3D Scene Parsing and Reconstruction from a Single RGB Image

Cited by: 84
Authors
Huang, Siyuan [1, 2]
Qi, Siyuan [1, 2]
Zhu, Yixin [1, 2]
Xiao, Yinxue [1]
Xu, Yuanlu [1, 2]
Zhu, Song-Chun [1, 2]
Affiliations
[1] University of California, Los Angeles, Los Angeles, CA 90024 USA
[2] International Center for AI and Robot Autonomy (CARA), Los Angeles, CA 90024 USA
Keywords
3D scene parsing and reconstruction; Analysis-by-synthesis; Holistic Scene Grammar; Markov chain Monte Carlo
DOI
10.1007/978-3-030-01234-2_12
Chinese Library Classification number
TP18 [Theory of artificial intelligence]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We propose a computational framework to jointly parse a single RGB image and reconstruct a holistic 3D configuration composed of a set of CAD models, using a stochastic grammar model. Specifically, we introduce a Holistic Scene Grammar (HSG) to represent the 3D scene structure, which characterizes a joint distribution over the functional and geometric space of indoor scenes. The proposed HSG captures three essential and often latent dimensions of indoor scenes: (i) latent human context, describing the affordance and functionality of a room arrangement, (ii) geometric constraints over the scene configurations, and (iii) physical constraints that guarantee physically plausible parsing and reconstruction. We solve this joint parsing and reconstruction problem in an analysis-by-synthesis fashion, seeking to minimize the differences between the input image and the images rendered from our 3D representation, measured over depth, surface normal, and object segmentation maps. The optimal configuration, represented by a parse graph, is inferred using Markov chain Monte Carlo (MCMC), which efficiently traverses the non-differentiable solution space, jointly optimizing object localization, 3D layout, and hidden human context. Experimental results demonstrate that the proposed algorithm improves generalization and significantly outperforms prior methods on 3D layout estimation, 3D object detection, and holistic scene understanding.
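The analysis-by-synthesis loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is a toy Metropolis-Hastings search, assuming a hypothetical 1D "depth map" in which a single box (position and width) stands in for the scene configuration. The names `render`, `energy`, `propose`, `valid`, and `mh_search`, and all parameter values, are assumptions made for illustration only.

```python
import math
import random

def render(pos, width, size=40, near=1.0, far=3.0):
    """Render a toy 1D depth map: a box at depth `near` in front of a wall at `far`."""
    return [near if pos <= i < pos + width else far for i in range(size)]

def energy(state, observed):
    """Sum of absolute differences between the rendered and observed maps."""
    pos, width = state
    rendered = render(pos, width, len(observed))
    return sum(abs(r - o) for r, o in zip(rendered, observed))

def propose(state, rng):
    """Symmetric random-walk proposal: shift position or width by one cell."""
    pos, width = state
    step = rng.choice([-1, 1])
    if rng.random() < 0.5:
        return pos + step, width
    return pos, width + step

def valid(state, size):
    """A hypothesis is valid if the box is non-empty and lies inside the map."""
    pos, width = state
    return pos >= 0 and width >= 1 and pos + width <= size

def mh_search(observed, steps=20000, temperature=1.0, seed=0):
    """Metropolis-Hastings over (pos, width); returns the lowest-energy state seen."""
    rng = random.Random(seed)
    size = len(observed)
    state = (0, 1)  # arbitrary initial hypothesis
    e = energy(state, observed)
    best, best_e = state, e
    for _ in range(steps):
        cand = propose(state, rng)
        if not valid(cand, size):
            continue  # invalid proposals are rejected, so the chain stays put
        e_cand = energy(cand, observed)
        # Always accept downhill or equal moves; accept uphill moves
        # with probability exp(-(E_new - E_old) / T).
        if e_cand <= e or rng.random() < math.exp(-(e_cand - e) / temperature):
            state, e = cand, e_cand
            if e < best_e:
                best, best_e = state, e
    return best, best_e

if __name__ == "__main__":
    observed = render(12, 7)  # synthetic "input image" with a known box
    print(mh_search(observed))
```

The energy here is piecewise constant in the state, so gradient-based optimization has nothing to follow; the sampler instead relies on random proposals and probabilistic acceptance of uphill moves, which is the general reason MCMC suits a non-differentiable solution space.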
Pages: 194-211
Page count: 18