Scene Graph Prediction with Limited Labels

Cited by: 3

Authors:

Chen, Vincent S. [1]

Varma, Paroma [1]

Krishna, Ranjay [1]

Bernstein, Michael [1]

Re, Christopher [1]

Fei-Fei, Li [1]

Affiliation:

[1] Stanford Univ, Stanford, CA 94305 USA

Source:

2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW) | 2019

Keywords:

DOI:

10.1109/ICCVW.2019.00220

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Visual knowledge bases such as Visual Genome power numerous applications in computer vision, including visual question answering and captioning, but suffer from sparse, incomplete relationships. All scene graph models to date are limited to training on a small set of visual relationships that have thousands of training labels each. Hiring human annotators is expensive, and textual knowledge base completion methods are incompatible with visual data. In this paper, we introduce a semi-supervised method that assigns probabilistic relationship labels to a large number of unlabeled images using few labeled examples. We analyze visual relationships to suggest two types of image-agnostic features that are used to generate noisy heuristics, whose outputs are aggregated using a factor graph-based generative model. With as few as 10 labeled examples per relationship, the generative model creates enough training data to train any existing state-of-the-art scene graph model. We demonstrate that our method outperforms all baseline approaches on scene graph prediction by 5.16 recall@100 for PREDCLS. In our limited label setting, we define a complexity metric for relationships that serves as an indicator (R^2 = 0.778) for conditions under which our method succeeds over transfer learning, the de facto approach for training with limited labels.
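To illustrate the idea of aggregating noisy heuristic outputs into probabilistic relationship labels, the following is a minimal sketch only. It is not the factor graph-based generative model from the paper: it swaps in a simple naive Bayes style vote combination, and the function name `aggregate_votes`, the `ABSTAIN` marker, and the per-heuristic accuracy values are all hypothetical assumptions (the paper's model would estimate such quantities rather than take them as input).

```python
import math

ABSTAIN = None  # hypothetical marker: a heuristic may decline to vote


def aggregate_votes(votes, accuracies, num_classes):
    """Combine noisy heuristic votes into a probability distribution.

    votes:       one entry per heuristic, a class index or ABSTAIN
    accuracies:  assumed accuracy of each heuristic, strictly between 0 and 1
                 (illustrative inputs, not learned as in the paper)
    num_classes: number of candidate relationship classes (at least 2)
    """
    log_scores = [0.0] * num_classes
    for vote, acc in zip(votes, accuracies):
        if vote is ABSTAIN:
            continue  # abstaining heuristics contribute no evidence
        for c in range(num_classes):
            # Assumption: a heuristic is correct with probability `acc`,
            # and its errors are spread uniformly over the other classes.
            p = acc if c == vote else (1.0 - acc) / (num_classes - 1)
            log_scores[c] += math.log(p)
    # Normalize in a numerically stable way.
    max_log = max(log_scores)
    unnorm = [math.exp(s - max_log) for s in log_scores]
    total = sum(unnorm)
    return [u / total for u in unnorm]


# Three heuristics vote on one object pair over three relationship classes.
probs = aggregate_votes([0, 0, 1], [0.8, 0.7, 0.6], num_classes=3)
```

The resulting distribution could then serve as a soft training label for a downstream scene graph model, which is the role the abstract assigns to the generative model's output.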

Pages: 1772-1782

Page count: 11
