YOLO-6D-Pose: Enhancing YOLO for Single-Stage Monocular Multi-Object 6D Pose Estimation

Cited by: 0

Authors:

Maji, Debapriya [1]

Nagori, Soyeb [1]

Mathew, Manu [1]

Poddar, Deepak [1]

Affiliations:

[1] Texas Instruments Inc, Bangalore, India

Source:

2024 INTERNATIONAL CONFERENCE ON 3D VISION, 3DV 2024 | 2024

Keywords:

DOI:

10.1109/3DV62453.2024.00160

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Directly regressing 6 degrees of freedom for all the objects from a single RGB image is not well explored. Even end-to-end pose estimation approaches for a single object are inferior to state-of-the-art methods in terms of accuracy. Most 6D pose estimation frameworks are multi-stage, relying on off-the-shelf deep networks for object and keypoint detection to establish correspondences between 3D object keypoints and 2D image locations. This is followed by a variant of RANSAC-based Perspective-n-Point (PnP) and then a complex refinement operation. In this work, we propose a multi-object 6D pose estimation framework by enhancing the popular YOLOX object detector. The network is end-to-end trainable and detects each object along with its pose from a single RGB image without any additional post-processing. We show that by properly parameterizing the 6D pose and carefully designing the loss function, we can achieve state-of-the-art accuracy without further refinement or any intermediate representations. YOLO-6D-Pose achieves SOTA results on the YCBV and LMO datasets, surpassing all existing monocular approaches. We systematically analyze various 6D augmentations to verify their correctness and propose a new translation augmentation for this task. The network does not rely on any correspondences and is independent of the CAD model during inference. Code is available at https://github.com/TexasInstruments/edgeai-yolox.


Pages: 1616-1625

Page count: 10
