Learning Disentangled Representation for Multi-View 3D Object Recognition

Cited by: 23
Authors
Huang, Jingjia [1]
Yan, Wei [1]
Li, Ge [1]
Li, Thomas [2]
Liu, Shan [3]
Affiliations
[1] Peking Univ, Sch Elect & Comp Engn, Shenzhen Grad Sch, Shenzhen 518055, Peoples R China
[2] Peking Univ, AIIT, Hangzhou 100871, Peoples R China
[3] Tencent Media Lab, Palo Alto, CA 94301 USA
Keywords
Three-dimensional displays; Solid modeling; Feature extraction; Task analysis; Computer architecture; Object recognition; Computational modeling; Multi-view 3D object; object recognition; disentangled representation; FEATURES;
DOI
10.1109/TCSVT.2021.3062190
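Chinese Library Classification
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline classification code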
0808 ; 0809 ;
Abstract
3D object recognition is an active research topic. In particular, view-based methods, which represent a 3D object by a collection of its rendered 2D views, play an important role in this field. Current view-based research tends to aggregate information from multiple views via pooling-based strategies, which makes the models invariant to view permutation at the cost of an inevitable loss of useful features. In this paper, we introduce a new method that learns a more comprehensive descriptor for a 3D object from its views while remaining robust to variation in view permutation. Our method disentangles the information in the set of multi-view images into a global category-related feature and a set of view-permutation related features. To separate these two parts, we propose an encoder-decoder based disentangling architecture, which adds little extra computation compared to the baseline model. Systematic experiments on the ModelNet40, ModelNet10, and ShapeNetCore55 datasets demonstrate the effectiveness and competitive performance of the new method. Code for our paper will be released soon at "https://github.com/hjjpku/multi_view_sort".
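The trade-off the abstract attributes to pooling-based aggregation can be illustrated with a minimal sketch. This is not the authors' architecture: the function name `max_pool_views`, the 12 views, and the 8-dimensional random feature vectors are all assumptions made up for illustration. It only shows that element-wise max pooling over views gives the same descriptor for any view order, while keeping just one value per feature dimension.

```python
import random


def max_pool_views(view_features):
    """Element-wise max over a list of per-view feature vectors."""
    return [max(column) for column in zip(*view_features)]


random.seed(0)

# Hypothetical input: 12 rendered views, each with an 8-dimensional feature.
views = [[random.random() for _ in range(8)] for _ in range(12)]

# Present the same views in a different order.
shuffled = views[:]
random.shuffle(shuffled)

pooled = max_pool_views(views)
pooled_shuffled = max_pool_views(shuffled)

# Pooling ignores view order, so both descriptors are identical.
order_invariant = pooled == pooled_shuffled

# Only one value per dimension survives; the rest are discarded.
discarded = sum(len(v) for v in views) - len(pooled)

print("order invariant:", order_invariant)
print("values discarded by pooling:", discarded)
```

The discarded per-view values are the kind of information that, according to the abstract, the proposed method tries to retain as a separate set of view-permutation related features.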
Pages: 646-659
Page count: 14