CVML-Pose: Convolutional VAE Based Multi-Level Network for Object 3D Pose Estimation

Cited by: 3

Authors:

Zhao, Jianyu [1]

Sanderson, Edward [1]

Matuszewski, Bogdan J. J. [1]

Affiliations:

[1] Univ Cent Lancashire, Comp Vis & Machine Learning CVML Grp, Preston PR1 2HE, England

Source:

IEEE ACCESS | 2023 / Vol. 11

Funding:

UK Engineering and Physical Sciences Research Council (EPSRC);

Keywords:

3D pose estimation; deep learning; variational autoencoder; synthetic data; 6D POSE;

DOI:

10.1109/ACCESS.2023.3243551

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

Most vision-based 3D pose estimation approaches rely on knowledge of the object's 3D model and depth measurements, and often require time-consuming iterative refinement to improve accuracy. These requirements limit broader real-life applications. To address these limitations, a novel Convolutional Variational Auto-Encoder based Multi-Level Network for object 3D pose estimation (CVML-Pose) is proposed. Unlike most other methods, CVML-Pose implicitly learns an object's 3D pose from RGB images alone, encoding it in its latent space, without knowledge of the object's 3D model or depth information and without post-refinement. CVML-Pose consists of two main modules: (i) CVML-AE, a convolutional variational autoencoder that extracts features from RGB images; and (ii) Multi-Layer Perceptron and K-Nearest Neighbor regressors that map the latent variables to the object's 3D pose, namely rotation and translation, respectively. CVML-Pose has been evaluated on the LineMod and LineMod-Occlusion benchmark datasets. It outperforms other methods based on latent representations and achieves results comparable to the state of the art, without using a 3D model or depth measurements. Visualization with the t-Distributed Stochastic Neighbor Embedding algorithm shows that the CVML-Pose latent space represents objects' category and topology. This opens up the prospect of integrated estimation of pose and other attributes (possibly including surface finish or shape variations), which, combined with real-time processing enabled by the absence of iterative refinement, can facilitate various robotic applications. Code is available at https://github.com/JZhao12/CVML-Pose.
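The abstract states that a K-Nearest Neighbor regressor maps latent variables to translation. The following is a minimal sketch of that general idea only, not the authors' implementation (which is in the linked repository). The function name `knn_regress`, the 2-D latent codes, the Euclidean distance, the unweighted averaging, and the value of `k` are all assumptions chosen for illustration.

```python
import math


def knn_regress(train_latents, train_targets, query, k=3):
    """Predict a target vector as the mean of the k nearest training targets.

    Hypothetical helper: distance metric and averaging are illustrative choices.
    """
    if not 1 <= k <= len(train_latents):
        raise ValueError("k must be between 1 and the number of training samples")
    # Rank training samples by Euclidean distance to the query latent vector.
    order = sorted(
        range(len(train_latents)),
        key=lambda i: math.dist(train_latents[i], query),
    )
    nearest = order[:k]
    dims = len(train_targets[0])
    # Average each target component over the k nearest neighbours.
    return [sum(train_targets[i][d] for i in nearest) / k for d in range(dims)]


# Toy data (made up): 2-D latent codes paired with (x, y, z) translations.
latents = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
translations = [[0.0, 0.0, 1.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0], [5.0, 5.0, 2.0]]

predicted = knn_regress(latents, translations, query=[0.2, 0.1], k=3)
```

In a real pipeline the latent vectors would come from a trained encoder rather than being written by hand, and the rotation would be handled by a separate regressor, as the abstract describes.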


Pages: 13830 - 13845

Number of pages: 16

50 items in total

[31] 3D Object's Pose Estimation Based on Colored Markers Information
Gao, Xiang
Zhang, Chong
Zhang, Chungang
Guo, Xijuan
ADVANCES IN MECHATRONICS, AUTOMATION AND APPLIED INFORMATION TECHNOLOGIES, PTS 1 AND 2, 2014, 846-847 : 1162 - +
[32] 3D Object Pose Estimation from Binarized Images
Kagami, Shingo
Morita, Masaru
Hashimoto, Koichi
2012 PROCEEDINGS OF SICE ANNUAL CONFERENCE (SICE), 2012, : 759 - 761
[33] RGB-D Camera based 3D Object Pose Estimation and Grasping
Liang, Xiaoxiao
Cheng, Hongtai
2019 9TH IEEE ANNUAL INTERNATIONAL CONFERENCE ON CYBER TECHNOLOGY IN AUTOMATION, CONTROL, AND INTELLIGENT SYSTEMS (IEEE-CYBER 2019), 2019, : 1279 - 1284
[34] From Contours to 3D Object Detection and Pose Estimation
Payet, Nadia
Todorovic, Sinisa
2011 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2011, : 983 - 990
[35] 3D pose estimation based on planar object tracking for UAVs control
Mondragon, Ivan F.
Campoy, Pascual
Martinez, Carol
Olivares-Mendez, Miguel A.
2010 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2010, : 35 - 41
[36] Contour-based iterative pose estimation of 3D rigid object
Leng, D. W.
Sun, W. D.
IET COMPUTER VISION, 2011, 5 (05) : 291 - 300
[37] 3D Object Pose Estimation for Robotic Packing Applications
Rodriguez-Garavito, C. H.
Camacho-Munoz, Guillermo
Alvarez-Martinez, David
Viviano Cardenas, Karol
Mateo Rojas, David
Grimaldos, Andres
APPLIED COMPUTER SCIENCES IN ENGINEERING, WEA 2018, PT II, 2018, 916 : 453 - 463
[38] 3D generic object categorization, localization and pose estimation
Savarese, Silvio
Fei-Fei, Li
2007 IEEE 11TH INTERNATIONAL CONFERENCE ON COMPUTER VISION, VOLS 1-6, 2007, : 1245 - 1252
[39] Corner-based 3D Object Pose Estimation in Robot Vision
Zhang, Lei
Guo, Zhiyang
Chen, Huilin
Shuai, Liguo
2016 8TH INTERNATIONAL CONFERENCE ON INTELLIGENT HUMAN-MACHINE SYSTEMS AND CYBERNETICS (IHMSC), VOL. 2, 2016, : 363 - 368
[40] Object Pose Estimation Method Based on 3D Key Points Voting
Wang T.
Yu E.
Tianjin Daxue Xuebao (Ziran Kexue yu Gongcheng Jishu Ban)/Journal of Tianjin University Science and Technology, 2024, 57 (03) : 291 - 300
