CatTrack: Single-Stage Category-Level 6D Object Pose Tracking via Convolution and Vision Transformer

Authors
Yu, Sheng [1]
Zhai, Di-Hua [1, 2]
Xia, Yuanqing [1]
Li, Dong [3]
Zhao, Shiqi [3]
Affiliations
[1] Beijing Inst Technol, Sch Automat, Beijing 100081, Peoples R China
[2] Beijing Inst Technol, Yangtze Delta Reg Acad, Jiaxing 314001, Peoples R China
[3] China Unicom Res Inst, Beijing 102676, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
pose tracking; transformer; pose estimation
DOI
10.1109/TMM.2023.3284598
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In the current research, many researchers have focused on instance-level pose tracking, which requires a 3D model of the object in advance, making it challenging to apply in practice. To address this limitation, some researchers have proposed the category-level object pose tracking method. Achieving accurate and speedy monocular category-level pose tracking is an essential research goal. In this article, we propose CatTrack, a new single-stage keypoints-based monocular category-level multi-object pose tracking network. A significant issue in object pose tracking tasks is utilizing the information from the previous frame to guide pose estimation for the next frame. However, as the object poses and camera information in each frame are different, we need to remove irrelevant information and emphasize useful features. To this end, we propose a transformer-based temporal information capture module to leverage the position information of keypoints from the previous frame. Furthermore, we propose a new keypoint matching module to enable the grouping and matching of object keypoints in complex scenes. We have successfully applied CatTrack to the Objectron dataset and achieved superior results in comparison to existing methods. Furthermore, we have also evaluated the generalization of CatTrack and successfully applied it to track the 6D pose of unseen real-world objects.
Pages: 1665-1680
Page count: 16
Source
IEEE Transactions on Multimedia, 2024, vol. 26