Self-supervised Wide Baseline Visual Servoing via 3D Equivariance

Cited by: 1
Authors
Huh, Jinwook [1]
Hong, Jungseok [1]
Garg, Suveer [1]
Park, Hyun Soo [1]
Isler, Volkan [1]
Affiliations
[1] Samsung AI Ctr NY, 837 Washington St, New York, NY 10014 USA
DOI
10.1109/IROS47612.2022.9981907
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Subject classification code
0812
Abstract
Visual servoing is especially challenging when the initial and goal camera views are far apart, because a wide baseline can drastically change an object's appearance and introduce occlusions. This paper presents a self-supervised visual servoing method for wide-baseline images that does not require 3D ground-truth supervision. Existing approaches that regress absolute camera pose with respect to an object require 3D ground-truth data for the object in the form of 3D bounding boxes or meshes. We learn a coherent visual representation by leveraging a geometric property called 3D equivariance: the representation transforms in a predictable way as a function of the 3D transformation. To ensure that the feature space is faithful to the underlying geodesic space, a geodesic-preserving constraint is applied in conjunction with the equivariance. We design a Siamese network that can effectively enforce these two geometric properties without requiring 3D supervision. With the learned model, the relative transformation can be inferred simply by following the gradient in the learned space and used as feedback for closed-loop visual servoing. We evaluate our method on objects from the YCB dataset, where it outperforms state-of-the-art approaches that use 3D supervision on a visual servoing, or object alignment, task. It reduces the average distance error by more than 35% and achieves a success rate above 90% at a 3 cm error tolerance.
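The idea of inferring motion "by following the gradient in the learned space" can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' network or training procedure: the matrix `A`, the function names `encode`, `feature_error`, `numerical_gradient`, and `servo`, the gain, and the restriction to a 3D camera position (no rotation, no images) are all hypothetical. A fixed linear map stands in for the learned encoder because it is trivially equivariant to translation, which makes the closed-loop update easy to follow.

```python
import numpy as np

# Hypothetical stand-in for a learned encoder: a fixed linear map from a
# 3D camera position to a 4D feature. A real encoder would consume images.
A = np.array([[1.0, 0.2, 0.0],
              [0.0, 1.0, 0.3],
              [0.1, 0.0, 1.0],
              [0.5, 0.5, 0.5]])


def encode(position):
    """Toy feature map, equivariant to translation:
    encode(p + t) == encode(p) + A @ t."""
    return A @ np.asarray(position, dtype=float)


def feature_error(position, goal_feature):
    """Half the squared distance between current and goal features."""
    diff = encode(position) - goal_feature
    return 0.5 * float(diff @ diff)


def numerical_gradient(position, goal_feature, eps=1e-5):
    """Central finite-difference gradient of the feature error
    with respect to the camera position."""
    grad = np.zeros_like(position)
    for i in range(position.size):
        step = np.zeros_like(position)
        step[i] = eps
        grad[i] = (feature_error(position + step, goal_feature)
                   - feature_error(position - step, goal_feature)) / (2 * eps)
    return grad


def servo(start, goal_feature, gain=0.3, tol=1e-3, max_steps=500):
    """Closed-loop update: repeatedly move against the gradient of the
    feature error until the feature distance drops below tol."""
    position = np.asarray(start, dtype=float).copy()
    for _ in range(max_steps):
        if np.sqrt(2.0 * feature_error(position, goal_feature)) < tol:
            break
        position -= gain * numerical_gradient(position, goal_feature)
    return position


if __name__ == "__main__":
    goal = np.array([0.0, 0.0, 0.0])
    start = np.array([0.5, -0.4, 0.3])  # a deliberately distant start
    final = servo(start, encode(goal))
    print("remaining position error:", np.linalg.norm(final - goal))
```

In this toy setting only the goal feature is passed to the controller, not the goal position, which mirrors the role of the goal image in servoing. The method described in the abstract additionally handles rotations and learns the encoder from images under equivariance and geodesic-preserving constraints, none of which this sketch attempts.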
Pages: 2227-2233
Page count: 7