Cross-Modal 360° Depth Completion and Reconstruction for Large-Scale Indoor Environment

Cited by: 14
Authors
Liu, Ruyu [1 ]
Zhang, Guodao [2 ]
Wang, Jiangming [3 ]
Zhao, Shuwen [4 ]
Affiliations
[1] Hangzhou Normal Univ, Sch Informat Sci & Technol, Hangzhou 311121, Peoples R China
[2] Zhejiang Univ Technol, Coll Comp Sci, Hangzhou 310023, Peoples R China
[3] East China Normal Univ, Coll Comp Sci & Technol, Inst Comp Vis, Shanghai 200062, Peoples R China
[4] Univ Portsmouth, Sch Comp, Intelligent Syst & Biomed Robot Grp, Portsmouth PO1 3HE, Hants, England
Keywords
Cameras; Three-dimensional displays; Image reconstruction; Task analysis; Simultaneous localization and mapping; Kernel; Estimation; Omnidirectional perception; cross-modal fusion; depth completion; dense reconstruction; VISUAL ODOMETRY; PREDICTION; MODEL
DOI
10.1109/TITS.2022.3155925
CLC Classification Number
TU [Architectural Science];
Discipline Code
0813
Abstract
In a large-scale epidemic, reducing direct contact among medical personnel, attendants, and patients has become a necessary means of epidemic prevention and control. Intelligent vehicles and mobile robots in the hospital environment, such as disinfection vehicles, logistics vehicles, nursing robots, and guiding robots, play an important role in improving the operational efficiency of the medical system and promoting epidemic prevention and governance. Powerful capabilities of environmental spatial perception and reconstruction are the keys to accurate localization, navigation, and obstacle avoidance for intelligent vehicles and autonomous robots in such operations. Omnidirectional perception is becoming increasingly important and prevalent in autonomous vehicles and robots, since its wide field of view significantly enhances perception ability. However, the lack of dense and accurate 360-degree depth datasets poses a challenge to omnidirectional perception. In this paper, we propose a depth-sensing and reconstruction system to address this challenge in large-scale indoor environments. First, we design an omnidirectional depth completion convolutional neural network, in which a spherical normalized convolution and a unit-sphere-area-based loss are introduced to extract features from cross-modal omnidirectional input with unequal sparsity and to handle the imbalanced data distribution and distortion of the panoramic input. In addition, we present a 3D reconstruction system by integrating our depth completion into omnidirectional localization and dense mapping. We evaluate our method on 360D large-scale indoor datasets and real-world sequences of a challenging hospital scene. Extensive experiments show that the proposed method outperforms other state-of-the-art (SoTA) approaches in terms of depth completion and 3D reconstruction.
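The abstract's "unit-sphere-area-based loss" is not specified further in this record, but the underlying idea is standard for equirectangular panoramas: rows near the poles cover far less surface area on the unit sphere than rows at the equator, so an unweighted per-pixel loss over-counts polar regions. A minimal sketch of such a loss, assuming a cos(latitude) solid-angle weighting and an L1 error term (the paper's exact formulation may differ):

```python
import numpy as np

def sphere_area_weighted_l1(pred, target, valid_mask):
    """L1 depth loss weighted by each pixel's solid angle on the unit sphere.

    In an equirectangular image of height h, row r maps to latitude
    lat = (0.5 - (r + 0.5) / h) * pi, and the solid angle of a pixel is
    proportional to cos(lat). Weighting by cos(lat) counters the
    over-representation of polar rows in the panoramic input.
    """
    h, w = target.shape
    lat = (0.5 - (np.arange(h) + 0.5) / h) * np.pi   # +pi/2 (top) .. -pi/2 (bottom)
    weights = np.repeat(np.cos(lat)[:, None], w, axis=1) * valid_mask
    num = np.sum(weights * np.abs(pred - target))
    return num / np.maximum(weights.sum(), 1e-8)     # weighted mean over valid pixels
```

With a uniform error of 1 over all valid pixels, the weighted mean is 1 regardless of the weighting, which gives a quick sanity check that the normalization is correct.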
Pages: 25180-25190
Page count: 11
Related Papers
(50 total)
  • [1] Unsupervised Deep Cross-Modal Hashing by Knowledge Distillation for Large-scale Cross-modal Retrieval
    Li, Mingyong
    Wang, Hongya
    [J]. PROCEEDINGS OF THE 2021 INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL (ICMR '21), 2021, : 183 - 191
  • [2] CCMB: A Large-scale Chinese Cross-modal Benchmark
    Xie, Chunyu
    Cai, Heng
    Li, Jincheng
    Kong, Fanjing
    Wu, Xiaoyu
    Song, Jianfei
    Morimitsu, Henrique
    Yao, Lin
    Wang, Dexin
    Zhang, Xiangzheng
    Leng, Dawei
    Zhang, Baochang
    Ji, Xiangyang
    Deng, Yafeng
    [J]. PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 4219 - 4227
  • [3] Large-Scale Supervised Hashing for Cross-Modal Retrieval
    Karbil, Loubna
    Daoudi, Imane
    [J]. 2017 IEEE/ACS 14TH INTERNATIONAL CONFERENCE ON COMPUTER SYSTEMS AND APPLICATIONS (AICCSA), 2017, : 803 - 808
  • [4] CIRCLE: Convolutional Implicit Reconstruction and Completion for Large-Scale Indoor Scene
    Chen, Hao-Xiang
    Huang, Jiahui
    Mu, Tai-Jiang
    Hu, Shi-Min
    [J]. COMPUTER VISION - ECCV 2022, PT XXXII, 2022, 13692 : 506 - 522
  • [5] Structure-Aware Cross-Modal Transformer for Depth Completion
    Zhao, Linqing
    Wei, Yi
    Li, Jiaxin
    Zhou, Jie
    Lu, Jiwen
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2024, 33 : 1016 - 1031
  • [6] Efficient discrete supervised hashing for large-scale cross-modal retrieval
    Yao, Tao
    Han, Yaru
    Wang, Ruxin
    Kong, Xiangwei
    Yan, Lianshan
    Fu, Haiyan
    Tian, Qi
    [J]. NEUROCOMPUTING, 2020, 385 : 358 - 367
  • [7] Multimodal Discriminative Binary Embedding for Large-Scale Cross-Modal Retrieval
    Wang, Di
    Gao, Xinbo
    Wang, Xiumei
    He, Lihuo
    Yuan, Bo
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2016, 25 (10) : 4540 - 4554
  • [8] Online Adaptive Supervised Hashing for Large-Scale Cross-Modal Retrieval
    Su, Ruoqi
    Wang, Di
    Huang, Zhen
    Liu, Yuan
    An, Yaqiang
    [J]. IEEE ACCESS, 2020, 8 : 206360 - 206370
  • [9] Label guided correlation hashing for large-scale cross-modal retrieval
    Dong, Guohua
    Zhang, Xiang
    Lan, Long
    Wang, Shiwei
    Luo, Zhigang
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2019, 78 (21) : 30895 - 30922