Object-centric Learning with Capsule Networks: A Survey

Cited by: 0
Authors
Ribeiro, Fabio De Sousa [1]
Duarte, Kevin [2]
Everett, Miles [3]
Leontidis, Georgios [3]
Shah, Mubarak [2]
Affiliations
[1] Imperial Coll London, Dept Comp, London, England
[2] Univ Cent Florida, Elect Engn & Comp Sci, Orlando, FL 32816 USA
[3] Univ Aberdeen, Dept Comp Sci, Aberdeen, Scotland
Keywords
deep learning; capsule networks; deep neural networks; convolutional neural networks; transformers; routing-by-agreement; self-attention; representation learning; object-centric learning; generative models; computer vision; attention; images
DOI
10.1145/3674500
Chinese Library Classification number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
Capsule networks emerged as a promising alternative to convolutional neural networks for learning object-centric representations. The idea is to explicitly model part-whole hierarchies by using groups of neurons called capsules to encode visual entities, then learn the relationships between these entities dynamically from data. However, a major hurdle for capsule network research has been the lack of a reliable point of reference for understanding their foundational ideas and motivations. This survey provides a comprehensive and critical overview of capsule networks, which aims to serve as a main point of reference going forward. To that end, we introduce the fundamental concepts and motivations behind capsule networks, such as equivariant inference. We then cover various technical advances in capsule routing algorithms as well as alternative geometric and generative formulations. We provide a detailed explanation of how capsule networks relate to the attention mechanism in Transformers and uncover non-trivial conceptual similarities between them in the context of object-centric representation learning. We also review the extensive applications of capsule networks in computer vision, video and motion, graph representation learning, natural language processing, medical imaging, and many others. To conclude, we provide an in-depth discussion highlighting promising directions for future work.
Pages: 43
Related papers
50 in total
  • [1] Dynamics Learning With Object-Centric Interaction Networks for Robot Manipulation
    Wang, Jiayu
    Hu, Chuxiong
    Wang, Yunan
    Zhu, Yu
    [J]. IEEE ACCESS, 2021, 9 : 68277 - 68288
  • [2] Dynamics Learning with Object-Centric Interaction Networks for Robot Manipulation
    Wang, Jiayu
    Hu, Chuxiong
    Wang, Yunan
    Zhu, Yu
    [J]. IEEE Access, 2021, 9 : 68277 - 68288
  • [3] Provably Learning Object-Centric Representations
    Brady, Jack
    Zimmermann, Roland S.
    Sharma, Yash
    Schoelkopf, Bernhard
    von Kuegelgen, Julius
    Brendel, Wieland
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 202, 2023, 202
  • [4] Object-Centric Learning with Slot Attention
    Locatello, Francesco
    Weissenborn, Dirk
    Unterthiner, Thomas
    Mahendran, Aravindh
    Heigold, Georg
    Uszkoreit, Jakob
    Dosovitskiy, Alexey
    Kipf, Thomas
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [5] Generalization and Robustness Implications in Object-Centric Learning
    Dittadi, Andrea
    Papa, Samuele
    De Vita, Michele
    Scholkopf, Bernhard
    Winther, Ole
    Locatello, Francesco
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022
  • [6] Learning Object-Centric Transformation for Video Prediction
    Chen, Xiongtao
    Wang, Wenmin
    Wang, Jinzhuo
    Li, Weimian
    [J]. PROCEEDINGS OF THE 2017 ACM MULTIMEDIA CONFERENCE (MM'17), 2017 : 1503 - 1511
  • [7] Object-Centric Debugging
    Ressia, Jorge
    Bergel, Alexandre
    Nierstrasz, Oscar
    [J]. 2012 34TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE), 2012 : 485 - 495
  • [8] Object-Centric Representation Learning for Video Scene Understanding
    Zhou, Yi
    Zhang, Hui
    Park, Seung-In
    Yoo, ByungIn
    Qi, Xiaojuan
    [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024, 46 (12) : 8410 - 8423
  • [9] Self-supervised Object-Centric Learning for Videos
    Aydemir, Gorkay
    Xie, Weidi
    Guney, Fatma
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
  • [10] Deep Object-Centric Representations for Generalizable Robot Learning
    Devin, Coline
    Abbeel, Pieter
    Darrell, Trevor
    Levine, Sergey
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2018 : 7111 - 7118