Compositional Scene Representation Learning via Reconstruction: A Survey

Cited by: 7

Authors:

Yuan, Jinyang [1]

Chen, Tonglin [1]

Li, Bin [1]

Xue, Xiangyang [1]

Affiliations:

[1] Fudan Univ, Sch Comp Sci, Shanghai Key Lab Intelligent Informat Proc, Shanghai 200433, Peoples R China

Source:

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE | 2023 / Vol. 45 / Issue 10

Funding:

National Natural Science Foundation of China;

Keywords:

Autoencoders; compositional scene representations; image reconstruction; neural networks; object-centric learning;

DOI:

10.1109/TPAMI.2023.3286184

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Visual scenes are composed of visual concepts and have the property of combinatorial explosion. An important reason for humans to efficiently learn from diverse visual scenes is the ability of compositional perception, and it is desirable for artificial intelligence to have similar abilities. Compositional scene representation learning is a task that enables such abilities. In recent years, various methods have been proposed to apply deep neural networks, which have been proven to be advantageous in representation learning, to learn compositional scene representations via reconstruction, advancing this research direction into the deep learning era. Learning via reconstruction is advantageous because it may utilize massive unlabeled data and avoid costly and laborious data annotation. In this survey, we first outline the current progress on reconstruction-based compositional scene representation learning with deep neural networks, including development history and categorizations of existing methods from the perspectives of the modeling of visual scenes and the inference of scene representations; then provide benchmarks, including an open source toolbox to reproduce the benchmark experiments, of representative methods that consider the most extensively studied problem setting and form the foundation for other methods; and finally discuss the limitations of existing methods and future directions of this research topic.

Pages: 11540 - 11560

Number of pages: 21

50 items in total

[31] Learning a Hierarchical Compositional Representation of Multiple Object Classes
Leonardis, Ales
2009 IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPR WORKSHOPS 2009), VOLS 1 AND 2, 2009 : 529 - 529
[32] Network Representation Learning: A Survey
Zhang, Daokun
Yin, Jie
Zhu, Xingquan
Zhang, Chengqi
IEEE TRANSACTIONS ON BIG DATA, 2020, 6 (01) : 3 - 28
[33] A Survey on Hypergraph Representation Learning
Antelmi, Alessia
Cordasco, Gennaro
Polato, Mirko
Scarano, Vittorio
Spagnuolo, Carmine
Yang, Dingqi
ACM COMPUTING SURVEYS, 2024, 56 (01)
[34] Survey on program representation learning
Ma J.-C.
Di X.-X.
Duan Z.-T.
Tang L.
Zhejiang Daxue Xuebao (Gongxue Ban)/Journal of Zhejiang University (Engineering Science), 2023, 57 (01) : 155 - 169
[35] Graph representation learning: a survey
Chen, Fenxiao
Wang, Yun-Cheng
Wang, Bin
Kuo, C. -C. Jay
APSIPA TRANSACTIONS ON SIGNAL AND INFORMATION PROCESSING, 2020, 9
[36] A benchmark and comprehensive survey on knowledge graph entity alignment via representation learning
Zhang, Rui
Trisedya, Bayu Distiawan
Li, Miao
Jiang, Yong
Qi, Jianzhong
VLDB JOURNAL, 2022, 31 (05) : 1143 - 1168
[37] A benchmark and comprehensive survey on knowledge graph entity alignment via representation learning
Rui Zhang
Bayu Distiawan Trisedya
Miao Li
Yong Jiang
Jianzhong Qi
The VLDB Journal, 2022, 31 : 1143 - 1168
[38] Remote Sensing Scene Classification by Unsupervised Representation Learning
Lu, Xiaoqiang
Zheng, Xiangtao
Yuan, Yuan
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2017, 55 (09) : 5148 - 5157
[39] Unsupervised Learning of Compositional Scene Representations from Multiple Unspecified Viewpoints
Yuan, Jinyang
Li, Bin
Xue, Xiangyang
THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELVETH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022 : 8971 - 8979
[40] Disentangling Visual Priors: Unsupervised Learning of Scene Interpretations with Compositional Autoencoder
Krawiec, Krzysztof
Nowinowski, Antoni
NEURAL-SYMBOLIC LEARNING AND REASONING, PT I, NESY 2024, 2024, 14979 : 240 - 256