Language-Mediated, Object-Centric Representation Learning

Cited by: 0
Authors
Wang, Ruocheng [1]
Mao, Jiayuan [2]
Gershman, Samuel J. [3]
Wu, Jiajun
Affiliations
[1] Stanford Univ, Stanford, CA 94305 USA
[2] MIT CSAIL, Cambridge, MA USA
[3] Harvard Univ, Cambridge, MA 02138 USA
Keywords
INDIVIDUATION;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present Language-mediated, Object-centric Representation Learning (LORL), a paradigm for learning disentangled, object-centric scene representations from vision and language. LORL builds upon recent advances in unsupervised object discovery and segmentation, notably MONet and Slot Attention. While these algorithms learn an object-centric representation solely by reconstructing the input image, LORL enables them to further learn to associate the learned representations with concepts, i.e., words for object categories, properties, and spatial relationships, from language input. These object-centric concepts derived from language in turn facilitate the learning of object-centric representations. LORL can be integrated with various unsupervised object discovery algorithms that are language-agnostic. Experiments show that integrating LORL consistently improves the performance of unsupervised object discovery methods on two datasets with the help of language. We also show that concepts learned by LORL, in conjunction with object discovery methods, aid downstream tasks such as referring expression comprehension.
Pages: 2033-2046
Page count: 14