Compositional Scene Representation Learning via Reconstruction: A Survey

Cited by: 7

Authors:

Yuan, Jinyang [1]

Chen, Tonglin [1]

Li, Bin [1]

Xue, Xiangyang [1]

Affiliations:

[1] Fudan Univ, Sch Comp Sci, Shanghai Key Lab Intelligent Informat Proc, Shanghai 200433, Peoples R China

Source:

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE | 2023 / Vol. 45 / No. 10

Funding:

National Natural Science Foundation of China;

Keywords:

Autoencoders; compositional scene representations; image reconstruction; neural networks; object-centric learning;

DOI:

10.1109/TPAMI.2023.3286184

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Visual scenes are composed of visual concepts and have the property of combinatorial explosion. An important reason for humans to efficiently learn from diverse visual scenes is the ability of compositional perception, and it is desirable for artificial intelligence to have similar abilities. Compositional scene representation learning is a task that enables such abilities. In recent years, various methods have been proposed to apply deep neural networks, which have been proven to be advantageous in representation learning, to learn compositional scene representations via reconstruction, advancing this research direction into the deep learning era. Learning via reconstruction is advantageous because it may utilize massive unlabeled data and avoid costly and laborious data annotation. In this survey, we first outline the current progress on reconstruction-based compositional scene representation learning with deep neural networks, including development history and categorizations of existing methods from the perspectives of the modeling of visual scenes and the inference of scene representations; then provide benchmarks, including an open source toolbox to reproduce the benchmark experiments, of representative methods that consider the most extensively studied problem setting and form the foundation for other methods; and finally discuss the limitations of existing methods and future directions of this research topic.

Pages: 11540 - 11560

Page count: 21
