CubeNet: Equivariance to 3D Rotation and Translation

Cited by: 60
Authors
Worrall, Daniel [1]
Brostow, Gabriel [1]
Affiliations
[1] UCL, Comp Sci Dept, London, England
Funding
UK Engineering and Physical Sciences Research Council (EPSRC);
Keywords
Deep learning; Equivariance; 3D representations; Invariance;
DOI
10.1007/978-3-030-01228-1_35
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
3D Convolutional Neural Networks are sensitive to transformations applied to their input. This is a problem because a voxelized version of a 3D object and its rotated clone will look unrelated to each other after passing through to the last layer of a network. Instead, an idealized model would preserve a meaningful representation of the voxelized object, while explaining the pose difference between the two inputs. An equivariant representation vector has two components: the invariant identity part, and a discernible encoding of the transformation. Models that cannot explain pose differences risk "diluting" the representation in pursuit of optimizing a classification or regression loss function. We introduce a Group Convolutional Neural Network with linear equivariance to translations and right-angle rotations in three dimensions. We call this network CubeNet, reflecting its cube-like symmetry. By construction, this network helps preserve a 3D shape's global and local signature as it is transformed through successive layers. We apply this network to a variety of 3D inference problems, achieving state-of-the-art results on the ModelNet10 classification challenge and comparable performance on the ISBI 2012 Connectome Segmentation Benchmark. To the best of our knowledge, this is the first 3D rotation-equivariant CNN for voxel representations.
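The abstract's split between an invariant part and an encoding of the transformation can be illustrated with a small NumPy sketch. It is not the authors' implementation, and it assumes a single cubic filter evaluated at one position rather than a full group convolution. The helper names `cube_rotations` and `group_responses` are hypothetical. The sketch enumerates the 24 right-angle rotations of a cubic voxel grid, correlates the input with every rotated copy of the filter, and checks that rotating the input only reorders the 24 responses.

```python
import itertools
import numpy as np


def cube_rotations(vol):
    """Return the 24 right-angle rotations of a cubic 3D array."""
    rotations = []
    for perm in itertools.permutations(range(3)):
        # Parity of the axis permutation: +1 if even, -1 if odd.
        parity = 1
        for i in range(3):
            for j in range(i + 1, 3):
                if perm[i] > perm[j]:
                    parity = -parity
        for signs in itertools.product((1, -1), repeat=3):
            # Keep proper rotations only (determinant +1), not reflections.
            if parity * signs[0] * signs[1] * signs[2] != 1:
                continue
            rotated = np.transpose(vol, perm)
            flip_axes = tuple(ax for ax, s in enumerate(signs) if s == -1)
            if flip_axes:
                rotated = np.flip(rotated, axis=flip_axes)
            rotations.append(rotated)
    return rotations


def group_responses(vol, filt):
    """Inner product of the input with each rotated copy of one filter."""
    return np.array([np.sum(g_filt * vol) for g_filt in cube_rotations(filt)])


rng = np.random.default_rng(0)
vol = rng.standard_normal((5, 5, 5))   # stand-in for a voxelized object
filt = rng.standard_normal((5, 5, 5))  # stand-in for a learned filter

base = group_responses(vol, filt)
rotated_input = np.rot90(vol, k=1, axes=(0, 1))
moved = group_responses(rotated_input, filt)

# Equivariance: the same 24 values appear, in a permuted order.
same_values = np.allclose(np.sort(base), np.sort(moved))
# Invariance: pooling over the group axis removes the pose dependence.
pooled_equal = np.isclose(base.max(), moved.max())
print(same_values, pooled_equal)
```

In this sketch the order of the 24 responses carries the pose information, while any order-independent pooling over them (here, the maximum) gives the invariant part.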
Pages: 585 - 602
Number of pages: 18
Related papers
50 in total
  • [1] Rotation invariance and equivariance in 3D deep learning: a survey
    Fei, Jiajun
    Deng, Zhidong
    ARTIFICIAL INTELLIGENCE REVIEW, 2024, 57 (07)
  • [2] Harmonic Networks: Deep Translation and Rotation Equivariance
    Worrall, Daniel E.
    Garbin, Stephan J.
    Turmukhambetov, Daniyar
    Brostow, Gabriel J.
    30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 7168 - 7177
  • [3] Rotation and Translation Invariant 3D Descriptor for Surfaces
    Hampp, Joshua
    Bormann, Richard
    2015 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2015, : 709 - 716
  • [4] Translation Scaling and Rotation invariants of 3D Krawtchouk moments
    Amakdouf, Hicham
    Zouhri, Amal
    El Mallahi, Mostafa
    Tahiri, Ahmed
    Qjidaa, Hassan
    2018 INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS AND COMPUTER VISION (ISCV2018), 2018,
  • [5] 3D Rotation and Translation for Hyperbolic Knowledge Graph Embedding
    Zhu, Yihua
    Shimodaira, Hidetoshi
    PROCEEDINGS OF THE 18TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS, 2024, : 1497 - 1515
  • [6] ROTATION, TRANSLATION, SIZE AND ILLUMINATION INVARIANCES IN 3D OBJECT RECOGNITION
    BRICOLO, E
    BULTHOFF, HH
    INVESTIGATIVE OPHTHALMOLOGY & VISUAL SCIENCE, 1993, 34 (04) : 1081 - 1081
  • [7] Ambisonic Signal Processing DNNs Guaranteeing Rotation, Scale and Time Translation Equivariance
    Sato, Ryotaro
    Niwa, Kenta
    Kobayashi, Kazunori
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, 29 : 1449 - 1462
  • [8] A Rotation, Translation, and Scaling Invariant Fourier Transform of 3D Image Function
    Chukanov, S. N.
    OPTOELECTRONICS INSTRUMENTATION AND DATA PROCESSING, 2008, 44 (03) : 249 - 255
  • [9] A rotation, translation, and scaling invariant Fourier transform of 3D image function
    S. N. Chukanov
    Optoelectronics, Instrumentation and Data Processing, 2008, 44 (3)
  • [10] Rotation Scaling and Translation Invariants of 3D Radial Shifted Legendre Moments
    El Mallahi M.
    El Mekkaoui J.
    Zouhri A.
    Amakdouf H.
    Qjidaa H.
    International Journal of Automation and Computing, 2018, 15 (2) : 169 - 180