VI-Net: Boosting Category-level 6D Object Pose Estimation via Learning Decoupled Rotations on the Spherical Representations

被引:3
|
作者
Lin, Jiehong [1 ,2 ]
Wei, Zewei [1 ]
Zhang, Yabin [3 ]
Jia, Kui [1 ]
机构
[1] South China Univ Technol, Guangzhou, Peoples R China
[2] DexForce Technol Co Ltd, Shenzhen, Peoples R China
[3] Hong Kong Polytech Univ, Hong Kong, Peoples R China
关键词
D O I
10.1109/ICCV51070.2023.01287
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Rotation estimation of high precision from an RGB-D object observation is a huge challenge in 6D object pose estimation, due to the difficulty of learning in the non-linear space of SO(3). In this paper, we propose a novel rotation estimation network, termed as VI- Net, to make the task easier by decoupling the rotation as the combination of a viewpoint rotation and an in-plane rotation. More specifically, VI-Net bases the feature learning on the sphere with two individual branches for the estimates of two factorized rotations, where a V-Branch is employed to learn the viewpoint rotation via binary classification on the spherical signals, while another I-Branch is used to estimate the in-plane rotation by transforming the signals to view from the zenith direction. To process the spherical signals, a Spherical Feature Pyramid Network is constructed based on a novel design of SPAtial Spherical Convolution (SPA-SConv), which settles the boundary problem of spherical signals via feature padding and realizes viewpoint-equivariant feature extraction by symmetric convolutional operations. We apply the proposed VI-Net to the challenging task of category-level 6D object pose estimation for predicting the poses of unknown objects without available CAD models; experiments on the benchmarking datasets confirm the efficacy of our method, which outperforms the existing ones with a large margin in the regime of high precision.
引用
收藏
页码:13955 / 13965
页数:11
相关论文
共 50 条
  • [1] An efficient network for category-level 6D object pose estimation
    Sun, Shantong
    Liu, Rongke
    Sun, Shuqiao
    Yang, Xinxin
    Lu, Guangshan
    [J]. SIGNAL IMAGE AND VIDEO PROCESSING, 2021, 15 (07) : 1643 - 1651
  • [2] CatFormer: Category-Level 6D Object Pose Estimation with Transformer
    Yu, Sheng
    Zhai, Di-Hua
    Xia, Yuanqing
    [J]. THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 7, 2024, : 6808 - 6816
  • [3] RANSAC Optimization for Category-level 6D Object Pose Estimation
    Chen, Ying
    Kang, Guixia
    Wang, Yiping
    [J]. 2020 5TH INTERNATIONAL CONFERENCE ON MECHANICAL, CONTROL AND COMPUTER ENGINEERING (ICMCCE 2020), 2020, : 50 - 56
  • [4] An efficient network for category-level 6D object pose estimation
    Shantong Sun
    Rongke Liu
    Shuqiao Sun
    Xinxin Yang
    Guangshan Lu
    [J]. Signal, Image and Video Processing, 2021, 15 : 1643 - 1651
  • [5] Category-Level 6D Object Pose Estimation via Cascaded Relation and Recurrent Reconstruction Networks
    Wang, Jiaze
    Chen, Kai
    Dou, Qi
    [J]. 2021 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2021, : 4807 - 4814
  • [6] Normalized Object Coordinate Space for Category-Level 6D Object Pose and Size Estimation
    Wang, He
    Sridhar, Srinath
    Huang, Jingwei
    Valentin, Julien
    Song, Shuran
    Guibas, Leonidas J.
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 2637 - 2646
  • [7] FS-Net: Fast Shape-based Network for Category-Level 6D Object Pose Estimation with Decoupled Rotation Mechanism
    Chen, Wei
    Jia, Xi
    Chang, Hyung Jin
    Duan, Jinming
    Shen, Linlin
    Leonardis, Ales
    [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 1581 - 1590
  • [8] Adversarial imitation learning-based network for category-level 6D object pose estimation
    Sun, Shantong
    Bao, Xu
    Kaushik, Aryan
    [J]. MACHINE VISION AND APPLICATIONS, 2024, 35 (05)
  • [9] Category-Level 6D Object Pose Estimation With Structure Encoder and Reasoning Attention
    Liu, Jierui
    Cao, Zhiqiang
    Tang, Yingbo
    Liu, Xilong
    Tan, Min
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (10) : 6728 - 6740
  • [10] Category-Level 6D Object Pose Recovery in Depth Images
    Sahin, Caner
    Kim, Tae-Kyun
    [J]. COMPUTER VISION - ECCV 2018 WORKSHOPS, PT I, 2019, 11129 : 665 - 681