Generic 3D Representation via Pose Estimation and Matching

Cited by: 31
Authors
Zamir, Amir R. [1]
Wekel, Tilman [1]
Agrawal, Pulkit [2]
Wei, Colin [1]
Malik, Jitendra [2]
Savarese, Silvio [1]
Affiliations
[1] Stanford Univ, Stanford, CA 94305 USA
[2] Univ Calif Berkeley, Berkeley, CA 94720 USA
Source
Keywords
Generic vision; Representation; Descriptor learning; Pose estimation; Wide-baseline matching; Street view;
DOI
10.1007/978-3-319-46487-9_33
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Though a large body of computer vision research has investigated developing generic semantic representations, efforts towards developing a similar representation for 3D have been limited. In this paper, we learn a generic 3D representation through solving a set of foundational proxy 3D tasks: object-centric camera pose estimation and wide baseline feature matching. Our method is based upon the premise that by providing supervision over a set of carefully selected foundational tasks, generalization to novel tasks and abstraction capabilities can be achieved. We empirically show that the internal representation of a multi-task ConvNet trained to solve the above core problems generalizes to novel 3D tasks (e.g., scene layout estimation, object pose estimation, surface normal estimation) without the need for fine-tuning and shows traits of abstraction abilities (e.g., cross-modality pose estimation). In the context of the core supervised tasks, we demonstrate that our representation achieves state-of-the-art wide baseline feature matching results without requiring a priori rectification (unlike SIFT and the majority of learnt features). We also show 6DOF camera pose estimation given a pair of local image patches. The accuracy on both supervised tasks is comparable to that of humans. Finally, we contribute a large-scale dataset composed of object-centric street view scenes along with point correspondences and camera pose information, and conclude with a discussion on the learned representation and open research questions.
Pages: 535-553
Page count: 19
Related papers
50 in total
  • [21] Learning a person-independent representation for precise 3D pose estimation
    Yan, Shuicheng
    Zhang, Zhenqiu
    Fu, Yun
    Hu, Yuxiao
    Tu, Jilin
    Huang, Thomas
    MULTIMODAL TECHNOLOGIES FOR PERCEPTION OF HUMANS, 2008, 4625 : 297 - 306
  • [22] Stabilization of 3D pose estimation
    Neddermeyer, W
    Schnell, M
    Winkler, W
    Lilienthal, A
    APPLICATIONS OF GEOMETRIC ALGEBRA IN COMPUTER SCIENCE AND ENGINEERING, 2002, : 385 - 394
  • [23] 3D ASSISTED FACE RECOGNITION VIA PROGRESSIVE POSE ESTIMATION
    Zhang, Wuming
    Huang, Di
    Samaras, Dimitris
    Morvan, Jean-Marie
    Wang, Yunhong
    Chen, Liming
    2014 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2014, : 728 - 732
  • [24] 3D Ego-Pose Estimation via Imitation Learning
    Yuan, Ye
    Kitani, Kris
    COMPUTER VISION - ECCV 2018, PT XVI, 2018, 11220 : 763 - 778
  • [25] New algorithms for 2D and 3D point matching: Pose estimation and correspondence
    Gold, S
    Rangarajan, A
    Lu, CP
    Pappu, S
    Mjolsness, E
    PATTERN RECOGNITION, 1998, 31 (08) : 1019 - 1031
  • [26] 3D pose estimation by directly matching polyhedral models to gray value gradients
    Universitaet Karlsruhe , Karlsruhe, Germany
    Int J Comput Vision, 3 (283-302):
  • [27] 3D Pose Estimation by Directly Matching Polyhedral Models to Gray Value Gradients
    Henner Kollnig
    Hans-Hellmut Nagel
    International Journal of Computer Vision, 1997, 23 : 283 - 302
  • [28] 3D pose estimation by directly matching polyhedral models to gray value gradients
    Kollnig, H
    Nagel, HM
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 1997, 23 (03) : 283 - 302
  • [29] Kinematic-Structure-Preserved Representation for Unsupervised 3D Human Pose Estimation
    Kundu, Jogendra Nath
    Seth, Siddharth
    Rahul, M. V.
    Rakesh, Mugalodi
    Babu, R. Venkatesh
    Chakraborty, Anirban
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 11312 - 11319
  • [30] OCR-Pose: Occlusion-aware Contrastive Representation for Unsupervised 3D Human Pose Estimation
    Wang, Junjie
    Yu, Zhenbo
    Tong, Zhengyan
    Wang, Hang
    Liu, Jinxian
    Zhang, Wenjun
    Wu, Xiaoyan
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022, : 5477 - 5485