Multi-Object Representation Learning via Feature Connectivity and Object-Centric Regularization

Authors
Foo, Alex [1]
Hsu, Wynne [1]
Lee, Mong Li [1]
Affiliations
[1] Natl Univ Singapore, Sch Comp, Singapore, Singapore
Funding
National Research Foundation, Singapore
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Discovering object-centric representations from images has the potential to greatly improve the robustness, sample efficiency and interpretability of machine learning algorithms. Current works on multi-object images typically follow a generative approach that optimizes for input reconstruction and fail to scale to real-world datasets despite significant increases in model capacity. We address this limitation by proposing a novel method that leverages feature connectivity to cluster neighboring pixels likely to belong to the same object. We further design two object-centric regularization terms to refine object representations in the latent space, enabling our approach to scale to complex real-world images. Experimental results on simulated, real-world, complex texture and common object images demonstrate a substantial improvement in the quality of discovered objects compared to state-of-the-art methods, as well as the sample efficiency and generalizability of our approach. We also show that the discovered object-centric representations can accurately predict key object properties in downstream tasks, highlighting the potential of our method to advance the field of multi-object representation learning.
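The abstract describes clustering neighboring pixels by feature connectivity but gives no algorithmic detail. As a rough illustration of the general idea only, and not the authors' method, the sketch below merges 4-connected pixels whose feature vectors have cosine similarity at or above a threshold, using union-find. The function names, the cosine criterion, and the `threshold` value are all assumptions introduced for this example.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sqrt(sum(a * a for a in u))
    nv = sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def cluster_by_connectivity(features, threshold=0.9):
    """Group neighboring pixels with similar features (hypothetical helper).

    features: H x W grid (list of rows) of per-pixel feature vectors.
    Returns an H x W grid of integer cluster labels, numbered in scan order.
    """
    h, w = len(features), len(features[0])
    parent = list(range(h * w))  # union-find forest over flattened pixel indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[max(ri, rj)] = min(ri, rj)

    # Connect each pixel to its right and bottom neighbors when features agree.
    for y in range(h):
        for x in range(w):
            for dy, dx in ((0, 1), (1, 0)):
                ny, nx = y + dy, x + dx
                if ny < h and nx < w:
                    if cosine(features[y][x], features[ny][nx]) >= threshold:
                        union(y * w + x, ny * w + nx)

    # Relabel union-find roots as consecutive cluster ids.
    ids = {}
    labels = []
    for y in range(h):
        row = []
        for x in range(w):
            root = find(y * w + x)
            row.append(ids.setdefault(root, len(ids)))
        labels.append(row)
    return labels
```

In this toy form, two pixels end up in the same cluster whenever a chain of similar adjacent pixels links them; the paper's actual connectivity criterion and its two regularization terms are not specified in this record.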
Pages: 13