Anytime 3D Object Reconstruction Using Multi-Modal Variational Autoencoder

Cited by: 0
Authors
Yu, Hyeonwoo [1 ,2 ]
Oh, Jean [2 ]
Affiliations
[1] Ulsan Natl Inst Sci & Technol UNIST, Sch Elect & Comp Engn, Ulsan, South Korea
[2] Carnegie Mellon Univ, Inst Robot, Pittsburgh, PA 15213 USA
Keywords
3D object reconstruction; multi-modal variational autoencoder; anytime algorithm; data imputation
DOI
10.1109/LRA.2022.3142439
CLC classification
TP24 (Robotics)
Discipline codes
080202; 1405
Abstract
For effective human-robot teaming, robots must be able to share their visual perception with human operators. In a harsh remote-collaboration setting, data compression techniques such as autoencoders can be used to encode and transmit the data as latent variables in a compact form. In addition, to ensure real-time performance even in unstable environments, an anytime estimation approach is desired that can reconstruct the full contents from incomplete information. In this context, we propose a method for imputing latent variables whose elements are partially lost. To achieve the anytime property with only a few dimensions of the latent variables, exploiting category-level prior information is essential. The prior distribution used in variational autoencoders is typically assumed to be an isotropic Gaussian regardless of the label of each training datapoint; such a flattened prior makes it difficult to perform imputation from category-level distributions. We overcome this limitation by exploiting a category-specific multi-modal prior distribution in the latent space. The missing elements of partially transferred data can be sampled by finding the specific mode that best matches the remaining elements. Since the method is designed to use partial elements for anytime estimation, it can also be applied to data over-compression. In experiments on the ModelNet and Pascal3D datasets, the proposed approach shows consistently superior performance over the autoencoder and variational autoencoder baselines with up to 70% data loss. The software is open source and available from our repository(1).
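
The imputation step described in the abstract can be illustrated with a small sketch. The code below is not the authors' released implementation; it assumes each category-specific prior mode is a diagonal-covariance Gaussian (so the conditional of the missing dimensions given the observed ones reduces to that mode's own marginal), and names such as impute_latent, mode_means, and mode_vars are illustrative. Given the surviving elements of a latent vector, the best-matching mode is chosen by the likelihood of the observed dimensions, and the lost dimensions are sampled from that mode.

import numpy as np

def impute_latent(z_partial, observed_mask, mode_means, mode_vars, rng=None):
    """Fill in the missing elements of a partially received latent vector.

    z_partial     : (D,) latent vector; entries where observed_mask is False are ignored.
    observed_mask : (D,) boolean mask of the elements that survived transmission.
    mode_means    : (K, D) means of the category-specific prior modes.
    mode_vars     : (K, D) per-dimension variances of each mode (diagonal Gaussians).
    """
    rng = np.random.default_rng() if rng is None else rng
    obs = observed_mask
    # Log-likelihood of the observed elements under each diagonal-Gaussian mode.
    diff = z_partial[obs][None, :] - mode_means[:, obs]              # (K, D_obs)
    log_lik = -0.5 * np.sum(diff ** 2 / mode_vars[:, obs]
                            + np.log(2.0 * np.pi * mode_vars[:, obs]), axis=1)
    k = int(np.argmax(log_lik))        # best-matching category mode

    # Sample the missing elements from that mode's marginal over the lost dims.
    z_full = z_partial.copy()
    miss = ~obs
    z_full[miss] = (mode_means[k, miss]
                    + np.sqrt(mode_vars[k, miss]) * rng.standard_normal(miss.sum()))
    return z_full

# Toy usage: an 8-D latent code, three category modes, half the elements lost.
rng = np.random.default_rng(0)
means = rng.normal(size=(3, 8))
variances = np.full((3, 8), 0.1)
z_true = means[1] + 0.1 * rng.standard_normal(8)
mask = np.array([True, False, True, False, True, False, True, False])
z_hat = impute_latent(z_true, mask, means, variances, rng)
print(np.round(z_hat - z_true, 2))   # imputed dims should stay close to mode 1

In this simplified setting, the same routine covers the over-compression case mentioned in the abstract: deliberately transmitting only a subset of latent dimensions and filling in the rest at the receiver.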
Pages: 2162-2169
Page count: 8
Related papers
50 in total
  • [1] Multi-Modal Streaming 3D Object Detection
    Abdelfattah, Mazen
    Yuan, Kaiwen
    Wang, Z. Jane
    Ward, Rabab
    [J]. IEEE ROBOTICS AND AUTOMATION LETTERS, 2023, 8 (10) : 6163 - 6170
  • [2] Multi-Modal 3D Object Detection by Box Matching
    Liu, Zhe
    Ye, Xiaoqing
    Zou, Zhikang
    He, Xinwei
    Tan, Xiao
    Ding, Errui
    Wang, Jingdong
    Bai, Xiang
    [J]. IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2024,
  • [3] Incremental Dense Multi-modal 3D Scene Reconstruction
    Miksik, Ondrej
    Amar, Yousef
    Vineet, Vibhav
    Perez, Patrick
    Torr, Philip H. S.
    [J]. 2015 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2015, : 908 - 915
  • [4] Teaching robots to do object assembly using multi-modal 3D vision
    Wan, Weiwei
    Lu, Feng
    Wu, Zepei
    Harada, Kensuke
    [J]. NEUROCOMPUTING, 2017, 259 : 85 - 93
  • [5] Quantization to accelerate inference in multi-modal 3D object detection
    Geerhart, Billy
    Dasari, Venkat R.
    Rapp, Brian
    Wang, Peng
    Wang, Ju
    Payne, Christopher X.
    [J]. DISRUPTIVE TECHNOLOGIES IN INFORMATION SCIENCES VIII, 2024, 13058
  • [6] Multi-Modal 3D Object Detection in Autonomous Driving: A Survey
    Wang, Yingjie
    Mao, Qiuyu
    Zhu, Hanqi
    Deng, Jiajun
    Zhang, Yu
    Ji, Jianmin
    Li, Houqiang
    Zhang, Yanyong
    [J]. INTERNATIONAL JOURNAL OF COMPUTER VISION, 2023, 131 (08) : 2122 - 2152
  • [8] ObjectFusion: Multi-modal 3D Object Detection with Object-Centric Fusion
    Cai, Qi
    Pan, Yingwei
    Yao, Ting
    Ngo, Chong-Wah
    Mei, Tao
    [J]. 2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 18021 - 18030
  • [9] Human Emotion Estimation Using Multi-Modal Variational AutoEncoder with Time Changes
    Moroto, Yuya
    Maeda, Keisuke
    Ogawa, Takahiro
    Haseyama, Miki
    [J]. 2021 IEEE 3RD GLOBAL CONFERENCE ON LIFE SCIENCES AND TECHNOLOGIES (IEEE LIFETECH 2021), 2021, : 67 - 68
  • [10] Jointly Trained Variational Autoencoder for Multi-Modal Sensor Fusion
    Korthals, Timo
    Hesse, Marc
    Leitner, Juergen
    Melnik, Andrew
    Rueckert, Ulrich
    [J]. 2019 22ND INTERNATIONAL CONFERENCE ON INFORMATION FUSION (FUSION 2019), 2019,