Certifiable Object Pose Estimation: Foundations, Learning Models, and Self-Training

Cited by: 3

Authors:

Talak, Rajat [1]

Peng, Lisa R. [1,2]

Carlone, Luca [1]

Affiliations:

[1] MIT, Lab Informat & Decis Syst, Cambridge, MA 02139 USA

[2] Ample, San Francisco, CA 94107 USA

Source:

IEEE TRANSACTIONS ON ROBOTICS | 2023 / Vol. 39 / No. 04

Funding:

National Science Foundation (USA);

Keywords:

Certifiable models; computer vision; 3D robot vision; object pose estimation; safe perception; self-supervised learning; PREDICTION;

DOI:

10.1109/TRO.2023.3271568

Chinese Library Classification (CLC) number:

TP24 [Robotics];

Discipline classification code:

080202 ; 1405 ;

Abstract:

In this article, we consider a certifiable object pose estimation problem, where, given a partial point cloud of an object, the goal is not only to estimate the object pose, but also to provide a certificate of correctness for the resulting estimate. Our first contribution is a general theory of certification for end-to-end perception models. In particular, we introduce the notion of ζ-correctness, which bounds the distance between an estimate and the ground truth. We then show that ζ-correctness can be assessed by implementing two certificates: 1) a certificate of observable correctness, which asserts whether the model output is consistent with the input data and prior information; and 2) a certificate of nondegeneracy, which asserts whether the input data are sufficient to compute a unique estimate. Our second contribution is to apply this theory and design a new learning-based certifiable pose estimator. In particular, we propose C-3PO, a semantic-keypoint-based pose estimation model, augmented with the two certificates, to solve the certifiable pose estimation problem. C-3PO also includes a keypoint corrector, implemented as a differentiable optimization layer, that can correct large detection errors (e.g., due to the sim-to-real gap). Our third contribution is a novel self-supervised training approach that uses our certificate of observable correctness to provide the supervisory signal to C-3PO during training. In this approach, the model trains only on the observably correct input-output pairs produced in each batch and at each iteration. As training progresses, the fraction of observably correct input-output pairs grows, eventually reaching nearly 100% in many cases. We conduct extensive experiments to evaluate the performance of the corrector, the certification, and the proposed self-supervised training using the ShapeNet and YCB datasets. The experiments show that 1) standard semantic-keypoint-based methods (which constitute the backbone of C-3PO) outperform more recent alternatives in challenging problem instances; 2) C-3PO further improves performance and significantly outperforms all the baselines; and 3) C-3PO's certificates are able to discern correct pose estimates.
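The certificate-gated self-training idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how the certificate of observable correctness is computed, so the check below (every observed point must lie within a tolerance `eps` of the predicted, posed model cloud) is an assumption chosen for illustration, and the names `observably_correct`, `select_certified`, and `eps` are hypothetical.

```python
import math

def nearest_distance(point, cloud):
    """Euclidean distance from `point` to its nearest neighbour in `cloud`."""
    return min(math.dist(point, q) for q in cloud)

def observably_correct(observed, predicted, eps):
    """Assumed certificate: True if every observed point lies within
    `eps` of the predicted (posed) model cloud. The paper's actual
    certificate may differ; this is only an illustrative stand-in."""
    return max(nearest_distance(p, predicted) for p in observed) < eps

def select_certified(batch, eps):
    """Keep only the (observed, predicted) pairs that pass the check.
    In self-training, only these pairs would contribute to the loss."""
    return [pair for pair in batch if observably_correct(pair[0], pair[1], eps)]

# Toy data: a three-point "model" and two partial observations of it.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
good = ([(0.01, 0.0, 0.0), (0.99, 0.0, 0.0)], model)  # close to the model
bad = ([(0.5, 0.5, 0.5)], model)                       # far from the model

kept = select_certified([good, bad], eps=0.05)
fraction = len(kept) / 2  # share of the batch that is observably correct
```

In a training loop, `fraction` would be the quantity the abstract reports as growing toward nearly 100% as training progresses; here it is computed once on a toy batch.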


Pages: 2805-2824

Page count: 20
