YOLO-6D-Pose: Enhancing YOLO for Single-Stage Monocular Multi-Object 6D Pose Estimation

Cited by: 0
Authors
Maji, Debapriya [1]
Nagori, Soyeb [1]
Mathew, Manu [1]
Poddar, Deepak [1]
Affiliations
[1] Texas Instruments Inc, Bangalore, India
DOI
10.1109/3DV62453.2024.00160
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Directly regressing 6 degrees of freedom for all the objects from a single RGB image is not well explored. Even end-to-end pose estimation approaches for a single object are inferior to state-of-the-art methods in terms of accuracy. Most 6D pose estimation frameworks are multi-stage, relying on off-the-shelf deep networks for object and keypoint detection to establish correspondences between 3D object keypoints and 2D image locations. This is followed by a variant of RANSAC-based Perspective-n-Point (PnP) and then a complex refinement operation. In this work, we propose a multi-object 6D pose estimation framework by enhancing the popular YOLOX object detector. The network is end-to-end trainable and detects each object along with its pose from a single RGB image without any additional post-processing. We show that by properly parameterizing the 6D pose and carefully designing the loss function, we can achieve state-of-the-art accuracy without further refinement or any intermediate representations. YOLO-6D-Pose achieves SOTA results on the YCBV and LMO datasets, surpassing all existing monocular approaches. We systematically analyze various 6D augmentations to verify their correctness and propose a new translation augmentation for this task. The network does not rely on any correspondences and is independent of the CAD model during inference. Code is available at https://github.com/TexasInstruments/edgeai-yolox.
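Note: as a minimal illustration of what "6 degrees of freedom" means here, the sketch below represents a pose as three rotation parameters (axis-angle) plus three translation parameters and projects one 3D object point into the image with a pinhole camera. This is generic geometry, not the parameterization or loss used in the paper; the function names and the camera intrinsics (`fx`, `fy`, `cx`, `cy`) are hypothetical values chosen for the example.

```python
import math


def rodrigues(rvec):
    """Convert an axis-angle vector (3 numbers) into a 3x3 rotation matrix."""
    theta = math.sqrt(sum(r * r for r in rvec))
    if theta < 1e-12:
        # No rotation: return the identity matrix.
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    kx, ky, kz = (r / theta for r in rvec)  # unit rotation axis
    c, s = math.cos(theta), math.sin(theta)
    v = 1.0 - c
    return [
        [c + kx * kx * v, kx * ky * v - kz * s, kx * kz * v + ky * s],
        [ky * kx * v + kz * s, c + ky * ky * v, ky * kz * v - kx * s],
        [kz * kx * v - ky * s, kz * ky * v + kx * s, c + kz * kz * v],
    ]


def project(point, rvec, tvec, fx, fy, cx, cy):
    """Apply the 6D pose (rvec, tvec) to an object-frame point and project it.

    Assumes an ideal pinhole camera without lens distortion.
    """
    rot = rodrigues(rvec)
    # Camera-frame coordinates: X_cam = R @ X_obj + t
    x, y, z = (
        sum(rot[i][j] * point[j] for j in range(3)) + tvec[i] for i in range(3)
    )
    return (fx * x / z + cx, fy * y / z + cy)


# Hypothetical example: object rotated 90 degrees about the camera z-axis
# and placed 1 m in front of the camera.
pixel = project(
    point=(0.1, 0.0, 0.0),
    rvec=(0.0, 0.0, math.pi / 2),
    tvec=(0.0, 0.0, 1.0),
    fx=600.0, fy=600.0, cx=320.0, cy=240.0,
)
```

A multi-stage pipeline of the kind the abstract describes would recover `rvec` and `tvec` from many such 3D-2D correspondences via PnP, whereas a direct-regression network predicts the six pose values for each detected object.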
Pages: 1616-1625
Number of pages: 10