Category-Level 6D Object Pose Estimation With Structure Encoder and Reasoning Attention

Cited by: 10

Authors:

Liu, Jierui [1,2]

Cao, Zhiqiang [1,2]

Tang, Yingbo [1,2]

Liu, Xilong [1,2]

Tan, Min [1,2]

Affiliations:

[1] Chinese Acad Sci, State Key Lab Management & Control Complex Syst, Inst Automat, Beijing 100190, Peoples R China

[2] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China

Source:

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY | 2022 / Vol. 32 / No. 10

Funding:

National Natural Science Foundation of China;

Keywords:

Shape; Three-dimensional displays; Cognition; Pose estimation; Feature extraction; Decoding; Solid modeling; Category-level; 6D object pose estimation; structure encoder; reasoning attention;

DOI:

10.1109/TCSVT.2022.3169144

Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronics and communication technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

Category-level 6D object pose estimation has gained popularity, but it remains challenging due to the diversity of instances within the same category. In this paper, a novel category-level 6D object pose estimation framework with a structure encoder and reasoning attention is proposed. A structure autoencoder is introduced to mine the structure features shared by color images within the same category, via a distinct learning strategy that recovers the image of another instance whose pose is the most similar to that of the input. On this basis, a reasoning attention decoder and fully connected layers are stacked to form a rotation prediction network, where the structure features and 3D shape features are integrated and projected to a semantic space. The semantic space includes observed patterns and learnable patterns, which are better learned by adding a shortcut connection branch, parallel to the reasoning attention decoder, with gradient decoupling. Further reasoning based on these patterns endows the decoder with powerful feature representation. Without 3D object models, the proposed method models the attributes of the category implicitly in the semantic space, and better performance of 6D object pose estimation is guaranteed by reasoning on this space. The effectiveness of the proposed method is verified by results on public datasets and by real-world experiments.

Pages: 6728-6740

Page count: 13
