Certifiable Object Pose Estimation: Foundations, Learning Models, and Self-Training

Cited by: 3
Authors
Talak, Rajat [1]
Peng, Lisa R. [1, 2]
Carlone, Luca [1]
Affiliations
[1] MIT, Laboratory for Information & Decision Systems, Cambridge, MA 02139 USA
[2] Ample, San Francisco, CA 94107 USA
Funding
National Science Foundation (USA);
Keywords
Certifiable models; computer vision; 3D robot vision; object pose estimation; safe perception; self-supervised learning; PREDICTION;
DOI
10.1109/TRO.2023.3271568
Chinese Library Classification
TP24 [Robotics];
Discipline classification code
080202 ; 1405 ;
Abstract
In this article, we consider a certifiable object pose estimation problem, where, given a partial point cloud of an object, the goal is not only to estimate the object pose, but also to provide a certificate of correctness for the resulting estimate. Our first contribution is a general theory of certification for end-to-end perception models. In particular, we introduce the notion of ζ-correctness, which bounds the distance between an estimate and the ground truth. We then show that ζ-correctness can be assessed by implementing two certificates: 1) a certificate of observable correctness, which asserts whether the model output is consistent with the input data and prior information; and 2) a certificate of nondegeneracy, which asserts whether the input data are sufficient to compute a unique estimate. Our second contribution is to apply this theory and design a new learning-based certifiable pose estimator. In particular, we propose C-3PO, a semantic-keypoint-based pose estimation model, augmented with the two certificates, to solve the certifiable pose estimation problem. C-3PO also includes a keypoint corrector, implemented as a differentiable optimization layer, that can correct large detection errors (e.g., due to the sim-to-real gap). Our third contribution is a novel self-supervised training approach that uses our certificate of observable correctness to provide the supervisory signal to C-3PO during training. In this approach, the model trains only on the observably correct input-output pairs produced in each batch and at each iteration. As training progresses, the fraction of observably correct input-output pairs grows, eventually reaching nearly 100% in many cases. We conduct extensive experiments to evaluate the performance of the corrector, the certification, and the proposed self-supervised training using the ShapeNet and YCB datasets. The experiments show that 1) standard semantic-keypoint-based methods (which constitute the backbone of C-3PO) outperform more recent alternatives in challenging problem instances; 2) C-3PO further improves performance and significantly outperforms all the baselines; and 3) C-3PO's certificates are able to discern correct pose estimates.
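The self-supervised selection step described in the abstract (train only on the input-output pairs that pass the certificate of observable correctness) can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not taken from the article: the certificate is approximated by a one-sided maximum nearest-neighbor distance between the observed partial point cloud and the posed object model, and the names `max_nearest_distance`, `observably_correct`, `select_certified`, and the threshold `eps` are hypothetical.

```python
import math


def max_nearest_distance(points, posed_model):
    """Largest distance from any observed point to its nearest posed-model point."""
    return max(min(math.dist(p, q) for q in posed_model) for p in points)


def observably_correct(points, posed_model, eps):
    """Assumed stand-in for a certificate of observable correctness.

    Returns True when every observed point lies within eps of the model
    placed at the estimated pose, i.e. the output is consistent with the input.
    """
    return max_nearest_distance(points, posed_model) < eps


def select_certified(batch, eps):
    """Keep only the (input, output) pairs that pass the certificate.

    In a self-training loop, the loss would be computed on this subset only.
    """
    return [(pts, mdl) for pts, mdl in batch if observably_correct(pts, mdl, eps)]


# Toy batch: one consistent pair and one inconsistent pair (made-up data).
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
consistent = [(0.0, 0.0, 0.01), (1.0, 0.0, 0.0)]
inconsistent = [(0.0, 0.0, 0.5)]
certified = select_certified([(consistent, model), (inconsistent, model)], eps=0.05)
```

The check is one-sided on purpose in this sketch: a partial point cloud covers only part of the object, so only the observed points are required to lie near the posed model, not the reverse.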
Pages: 2805 - 2824
Page count: 20