Attention-guided RGB-D Fusion Network for Category-level 6D Object Pose Estimation

Cited by: 3
Authors
Wang, Hao [1]
Li, Weiming [1]
Kim, Jiyeon [2]
Wang, Qiang [1]
Affiliations
[1] Samsung Res Ctr, SAIT China Lab, Beijing, Peoples R China
[2] Samsung Adv Inst Technol SAIT, Suwon, South Korea
DOI
10.1109/IROS47612.2022.9981242
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
This work focuses on estimating the 6D poses and sizes of category-level objects from a single RGB-D image. How to exploit the complementary RGB and depth features plays an important role in this task, yet it remains an open question. Because of large intra-category variations in texture and shape, an object instance seen at test time may have RGB and depth features that differ from those of the training instances, which poses challenges for previous RGB-D fusion methods. To address this problem, an Attention-guided RGB-D Fusion Network (ARF-Net) is proposed in this work. Our key design is an ARF module that learns to adaptively fuse RGB and depth features with guidance from both structure-aware attention and relation-aware attention. Specifically, the structure-aware attention captures the spatial relationships among object parts, and the relation-aware attention captures the RGB-to-depth correlations between appearance and geometric features. ARF-Net directly establishes canonical correspondences with a compact decoder based on the multi-modal features from the ARF module. Extensive experiments show that our method can effectively fuse RGB features with various popular point cloud encoders and provides consistent performance improvements. In particular, without reconstructing instance 3D models, our method, with its relatively compact architecture, outperforms all state-of-the-art models on the CAMERA25 and REAL275 benchmarks by a large margin.
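The two attention types named in the abstract can be illustrated with a minimal sketch. This is not the authors' ARF module: it assumes plain scaled dot-product attention with no learned projections, per-point geometric and RGB feature vectors of equal dimension, and fusion by simple concatenation. The names `attention` and `fuse_rgbd` are hypothetical and chosen only for this illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of feature vectors."""
    scale = math.sqrt(len(keys[0]))
    out = []
    for q in queries:
        # Similarity of this query to every key
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out


def fuse_rgbd(geo_feats, rgb_feats):
    """Hypothetical fusion: concatenate geometry with two attention outputs.

    Assumption: geo_feats and rgb_feats share the same feature dimension,
    because no learned projections are applied in this sketch.
    """
    # Stand-in for structure-aware attention: points attend to other points
    structure = attention(geo_feats, geo_feats, geo_feats)
    # Stand-in for relation-aware attention: geometric queries attend to RGB
    relation = attention(geo_feats, rgb_feats, rgb_feats)
    # `+` on lists concatenates, giving 3 * d features per point
    return [g + s + r for g, s, r in zip(geo_feats, structure, relation)]


geo = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # toy per-point geometric features
rgb = [[0.5, 0.5], [1.0, 0.0], [0.0, 1.0]]   # toy per-point RGB features
fused = fuse_rgbd(geo, rgb)
print(len(fused), len(fused[0]))  # 3 6
```

In a trained network the queries, keys, and values would pass through learned linear layers, and the fused features would feed a decoder that predicts canonical correspondences; both are omitted here.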
Pages: 10651-10658
Page count: 8