Towards Total Scene Understanding: Classification, Annotation and Segmentation in an Automatic Framework

Cited by: 0

Authors:

Li, Li-Jia [1]

Socher, Richard [1]

Li Fei-Fei [1]

Affiliation:

[1] Princeton Univ, Dept Comp Sci, Princeton, NJ 08544 USA

Source:

CVPR: 2009 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, VOLS 1-4 | 2009

Funding:

National Science Foundation (USA)

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Given an image, we propose a hierarchical generative model that classifies the overall scene, recognizes and segments each object component, as well as annotates the image with a list of tags. To our knowledge, this is the first model that performs all three tasks in one coherent framework. For instance, a scene of a 'polo game' consists of several visual objects such as 'human', 'horse', 'grass', etc. In addition, it can be further annotated with a list of more abstract (e.g. 'dusk') or visually less salient (e.g. 'saddle') tags. Our generative model jointly explains images through a visual model and a textual model. Visually relevant objects are represented by regions and patches, while visually irrelevant textual annotations are influenced directly by the overall scene class. We propose a fully automatic learning framework that is able to learn robust scene models from noisy web data such as images and user tags from Flickr.com. We demonstrate the effectiveness of our framework by automatically classifying, annotating and segmenting images from eight classes depicting sport scenes. In all three tasks, our model significantly outperforms state-of-the-art algorithms.
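
The two tag sources described in the abstract can be illustrated with a toy sampling sketch. This is not the authors' model: every name and probability below (`OBJECTS_GIVEN_SCENE`, `ABSTRACT_TAGS_GIVEN_SCENE`, `P_VISUAL`, the 'sailing' scene and its vocabulary) is a hypothetical value invented for illustration, and patches and regions are reduced to a plain list of object labels. It only shows the idea that a tag is either tied to an object present in the image or drawn directly from the scene class.

```python
import random

# Hypothetical toy parameters, invented for illustration only.
SCENES = ["polo", "sailing"]
OBJECTS_GIVEN_SCENE = {
    "polo": {"human": 0.4, "horse": 0.4, "grass": 0.2},
    "sailing": {"sailboat": 0.5, "water": 0.3, "sky": 0.2},
}
ABSTRACT_TAGS_GIVEN_SCENE = {
    "polo": {"dusk": 0.5, "saddle": 0.5},
    "sailing": {"wind": 0.6, "harbor": 0.4},
}
P_VISUAL = 0.7  # assumed probability that a tag names a visible object


def draw(dist, rng):
    """Sample one key from a {value: probability} dictionary."""
    values = list(dist)
    weights = [dist[v] for v in values]
    return rng.choices(values, weights=weights, k=1)[0]


def sample_image(rng, n_regions=4, n_tags=5):
    """Generate a scene label, per-region object labels, and a tag list."""
    scene = rng.choice(SCENES)
    # Visual side: each region is assigned an object drawn given the scene.
    objects = [draw(OBJECTS_GIVEN_SCENE[scene], rng) for _ in range(n_regions)]
    tags = []
    for _ in range(n_tags):
        if rng.random() < P_VISUAL:
            # Visually relevant tag: copies the label of an object in the image.
            tags.append(rng.choice(objects))
        else:
            # Visually irrelevant tag: depends only on the scene class.
            tags.append(draw(ABSTRACT_TAGS_GIVEN_SCENE[scene], rng))
    return {"scene": scene, "objects": objects, "tags": tags}


sample = sample_image(random.Random(0))
```

Inference in such a model would run in the opposite direction, from observed pixels and tags back to the scene and object labels; the sketch covers only the forward (generative) direction.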

Pages: 2036-2043

Number of pages: 8

