FS-Net: Fast Shape-based Network for Category-Level 6D Object Pose Estimation with Decoupled Rotation Mechanism

Cited by: 85
Authors
Chen, Wei [1]
Jia, Xi [1]
Chang, Hyung Jin [1]
Duan, Jinming [1]
Shen, Linlin [2]
Leonardis, Ales [1]
Affiliations
[1] University of Birmingham, School of Computer Science, Birmingham, West Midlands, England
[2] Shenzhen University, College of Computer Science and Software Engineering, Computer Vision Institute, Shenzhen, People's Republic of China
Funding
UK Engineering and Physical Sciences Research Council (EPSRC)
DOI
10.1109/CVPR46437.2021.00163
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we focus on category-level 6D pose and size estimation from a monocular RGB-D image. Previous methods suffer from inefficient category-level pose feature extraction, which leads to low accuracy and slow inference. To tackle this problem, we propose a fast shape-based network (FS-Net) with efficient category-level feature extraction for 6D pose estimation. First, we design an orientation-aware autoencoder with 3D graph convolution for latent feature extraction. Thanks to the shift- and scale-invariance properties of 3D graph convolution, the learned latent feature is insensitive to point shift and object size. Then, to efficiently decode category-level rotation information from the latent feature, we propose a novel decoupled rotation mechanism that employs two decoders to complementarily access the rotation information. We estimate translation and size through two residuals: the difference between the mean of the object points and the ground-truth translation, and the difference between the mean size of the category and the ground-truth size, respectively. Finally, to increase the generalization ability of FS-Net, we propose an online box-cage-based 3D deformation mechanism to augment the training data. Extensive experiments on two benchmark datasets show that the proposed method achieves state-of-the-art performance in both category- and instance-level 6D object pose estimation. In particular, in category-level pose estimation, without extra synthetic data, our method outperforms existing methods by 6.3% on the NOCS-REAL dataset.
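The residual formulation for translation and size described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name, the argument names, and the sign convention (residual = ground truth minus mean) are assumptions made for illustration, and the residuals would in practice come from the network rather than being passed in by hand.

```python
from statistics import fmean


def recover_translation_and_size(points, t_residual, category_mean_size, s_residual):
    """Recover translation and size from predicted residuals.

    points: list of (x, y, z) object points from the depth image.
    t_residual: predicted offset from the point mean to the translation
        (assumed convention: ground truth minus point mean).
    category_mean_size: mean (x, y, z) size of the object category.
    s_residual: predicted offset from the category mean size to the size
        (assumed convention: ground truth minus category mean).
    """
    # Mean of the observed object points along each axis.
    centroid = tuple(fmean(p[axis] for p in points) for axis in range(3))
    # Translation = point mean + predicted translation residual.
    translation = tuple(c + r for c, r in zip(centroid, t_residual))
    # Size = category mean size + predicted size residual.
    size = tuple(m + r for m, r in zip(category_mean_size, s_residual))
    return translation, size


# Hypothetical values, chosen only to show the arithmetic.
points = [(0.0, 0.0, 1.0), (2.0, 0.0, 1.0), (1.0, 3.0, 1.0)]
translation, size = recover_translation_and_size(
    points,
    t_residual=(0.5, -1.0, 0.25),
    category_mean_size=(0.5, 0.25, 1.0),
    s_residual=(0.25, 0.25, -0.5),
)
```

Predicting small offsets around the point mean and the category mean size, rather than absolute values, keeps the regression targets in a narrow range, which is the motivation the abstract gives for using residuals.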
Pages: 1581-1590
Number of pages: 10