Object-centric Learning with Capsule Networks: A Survey

Cited by: 0

Authors:

Ribeiro, Fabio De Sousa [1]

Duarte, Kevin [2]

Everett, Miles [3]

Leontidis, Georgios [3]

Shah, Mubarak [2]

Affiliations:

[1] Imperial Coll London, Dept Comp, London, England

[2] Univ Cent Florida, Elect Engn & Comp Sci, Orlando, FL 32816 USA

[3] Univ Aberdeen, Dept Comp Sci, Aberdeen, Scotland

Source:

ACM COMPUTING SURVEYS | 2024 / Vol. 56 / Issue 11

Keywords:

Deep learning; capsule networks; deep neural networks; convolutional neural networks; transformers; routing-by-agreement; self-attention; representation learning; object-centric learning; generative models; computer vision; ATTENTION; IMAGES

DOI:

10.1145/3674500

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods]

Discipline classification code:

081202

Abstract:

Capsule networks emerged as a promising alternative to convolutional neural networks for learning object-centric representations. The idea is to explicitly model part-whole hierarchies by using groups of neurons called capsules to encode visual entities, then learn the relationships between these entities dynamically from data. However, a major hurdle for capsule network research has been the lack of a reliable point of reference for understanding their foundational ideas and motivations. This survey provides a comprehensive and critical overview of capsule networks, which aims to serve as a main point of reference going forward. To that end, we introduce the fundamental concepts and motivations behind capsule networks, such as equivariant inference. We then cover various technical advances in capsule routing algorithms as well as alternative geometric and generative formulations. We provide a detailed explanation of how capsule networks relate to the attention mechanism in Transformers and uncover non-trivial conceptual similarities between them in the context of object-centric representation learning. We also review the extensive applications of capsule networks in computer vision, video and motion, graph representation learning, natural language processing, medical imaging, and many others. To conclude, we provide an in-depth discussion highlighting promising directions for future work.


Pages: 43

