Cross-Dimensional Refined Learning for Real-Time 3D Visual Perception from Monocular Video

Authors
Hong, Ziyang [1]
Yue, C. Patrick [1]
Affiliations
[1] Hong Kong Univ Sci & Technol, Hong Kong, Peoples R China
DOI
10.1109/ICCVW60793.2023.00231
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present a novel real-time-capable learning method that jointly perceives a 3D scene's geometric structure and semantic labels. Recent approaches to real-time 3D scene reconstruction mostly adopt a volumetric scheme, in which a Truncated Signed Distance Function (TSDF) is directly regressed. However, these volumetric approaches tend to focus on the global coherence of their reconstructions, which leads to a lack of local geometric detail. To overcome this issue, we propose leveraging the latent geometric prior knowledge in 2D image features, through explicit depth prediction and anchored feature generation, to refine occupancy learning in the TSDF volume. In addition, we find that this cross-dimensional feature refinement methodology can also be adopted for the semantic segmentation task by utilizing semantic priors. Hence, we propose an end-to-end cross-dimensional refinement neural network (CDRNet) to extract both the 3D mesh and 3D semantic labeling in real time. Experimental results show that this method achieves state-of-the-art 3D perception efficiency on multiple datasets, which indicates the great potential of our method for industrial applications.
Pages: 2161-2170
Page count: 10