Machine learning for scene 3D reconstruction using a single image

Cited by: 1

Authors:

Knyaz, Vladimir [1,2]

Affiliations:

[1] State Res Inst Aviat Syst GosNIIAS, 7 Victorenko Str, Moscow, Russia

[2] Moscow Inst Phys & Technol MIPT, Moscow, Russia

Source:

OPTICS, PHOTONICS AND DIGITAL TECHNOLOGIES FOR IMAGING APPLICATIONS VI | 2021 / Vol. 11353

Funding:

Russian Foundation for Basic Research;

Keywords:

image analysis; scene 3D reconstruction; voxel 3D model; deep learning; convolutional neural network; generative adversarial neural network; dataset;

DOI:

10.1117/12.2556122

Chinese Library Classification (CLC) number:

O43 [Optics];

Discipline classification code:

070207 ; 0803 ;

Abstract:

Image-based 3D scene reconstruction is a key task for many machine vision applications, such as scene understanding, object pose estimation, and autonomous navigation. A set of reliable and accurate methods for multi-view 3D scene reconstruction has been developed over the last decades. A significant drawback of such techniques, however, is the need to acquire a large number of images in the processed sequence to obtain an acceptable 3D scene representation. Recently, modern convolutional neural network (CNN) models have achieved the best quality for object recognition, image segmentation, image translation, and some other challenging computer vision problems. The paper proposes a convolutional neural network architecture and a technique for training data preparation that together provide a prediction of a voxel model of a 3D scene with several objects. For CNN training and evaluation, a special dataset was collected and annotated. It contains image sequences of several scenes, along with the corresponding depth images and 3D models of these scenes. The image sequence serves as the primary data for 3D scene reconstruction by the Structure from Motion (SfM) technique. SfM processing yields surface 3D models of all objects in the scene and the camera position and orientation for every image in the sequence. The surface 3D model is then transformed into a voxel 3D model and segmented into separate objects. A conditional generative adversarial network architecture was developed for 3D reconstruction from a single image. Its generative part translates an input color image into an output voxel model. The discriminative part distinguishes correct output (close to the real voxel model) from false output (a wrong voxel model). Both parts are trained simultaneously on the prepared dataset. Evaluation on the testing part of the prepared dataset demonstrated the ability to predict 3D models of previously unobserved complex scenes containing several objects. The proposed neural network architecture provides high generalization ability and improved resolution of the predicted voxel 3D models.
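
The surface-to-voxel step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the surface model is available as a set of sampled 3D points; the function name `voxelize`, the `resolution` parameter, and the uniform scaling into a cubic grid are illustrative assumptions, not the procedure used in the paper.

```python
import numpy as np

def voxelize(points: np.ndarray, resolution: int = 32) -> np.ndarray:
    """Convert an (N, 3) array of surface points into a boolean occupancy grid.

    Hypothetical helper for illustration only; not the paper's implementation.
    """
    points = np.asarray(points, dtype=np.float64)
    lo = points.min(axis=0)
    # Use the longest bounding-box side so the scaling is uniform on all axes.
    extent = (points.max(axis=0) - lo).max()
    if extent == 0:
        extent = 1.0  # degenerate case: all points coincide
    # Map coordinates to integer voxel indices in [0, resolution - 1].
    idx = np.floor((points - lo) / extent * resolution).astype(int)
    idx = np.clip(idx, 0, resolution - 1)
    grid = np.zeros((resolution, resolution, resolution), dtype=bool)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = True  # mark occupied voxels
    return grid
```

Calling `voxelize(points, resolution=64)` on points sampled from a surface mesh would return a 64×64×64 occupancy grid. A grid of this kind is one plausible form for the voxel model that a generator predicts from a color image.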

Pages: 10
