Towards Total Scene Understanding: Classification, Annotation and Segmentation in an Automatic Framework

Cited by: 0
Authors
Li, Li-Jia [1]
Socher, Richard [1]
Li Fei-Fei [1]
Affiliations
[1] Princeton Univ, Dept Comp Sci, Princeton, NJ 08544 USA
Funding
U.S. National Science Foundation
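Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract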
Given an image, we propose a hierarchical generative model that classifies the overall scene, recognizes and segments each object component, as well as annotates the image with a list of tags. To our knowledge, this is the first model that performs all three tasks in one coherent framework. For instance, a scene of a 'polo game' consists of several visual objects such as 'human', 'horse', 'grass', etc. In addition, it can be further annotated with a list of more abstract (e.g. 'dusk') or visually less salient (e.g. 'saddle') tags. Our generative model jointly explains images through a visual model and a textual model. Visually relevant objects are represented by regions and patches, while visually irrelevant textual annotations are influenced directly by the overall scene class. We propose a fully automatic learning framework that is able to learn robust scene models from noisy web data such as images and user tags from Flickr.com. We demonstrate the effectiveness of our framework by automatically classifying, annotating and segmenting images from eight classes depicting sport scenes. In all three tasks, our model significantly outperforms state-of-the-art algorithms.
Pages: 2036-2043
Page count: 8
Related papers
50 in total
  • [1] Towards a Scene-based Video Annotation Framework
    Getahun, Fekade
    Birara, Mekuanent
    [J]. 2015 11TH INTERNATIONAL CONFERENCE ON SIGNAL-IMAGE TECHNOLOGY & INTERNET-BASED SYSTEMS (SITIS), 2015, : 306 - 313
  • [2] Towards Automatic Image Annotation Supporting Document Understanding
    Markowska-Kaczmar, Urszula
    Minda, Pawel
    Ociepa, Krzysztof
    Olszowy, Dariusz
    Pawlikowski, Roman
    [J]. HYBRID ARTIFICIAL INTELLIGENT SYSTEMS, PART I, 2011, 6678 : 420 - 427
  • [3] Automatic Dense Annotation for Monocular 3D Scene Understanding
    Reza, Md Alimoor
    Chen, Kai
    Naik, Akshay
    Crandall, David J.
    Jung, Soon-Heung
    [J]. IEEE ACCESS, 2020, 8 : 68852 - 68865
  • [4] An automatic approach towards audio segmentation and classification
    Pan, Wenjuan
    Wang, Zongwu
    Liu, Zhijing
    [J]. PROGRESS IN INTELLIGENCE COMPUTATION AND APPLICATIONS, PROCEEDINGS, 2007, : 405 - 408
  • [5] Automatic segmentation and annotation in radiology
    Dankerl, P.
    Cavallaro, A.
    Uder, M.
    Hammon, M.
    [J]. RADIOLOGE, 2014, 54 (03): : 265 - 270
  • [7] Two-stage Automatic Image Annotation Based on Latent Semantic Scene Classification
    Ge, Hongwei
    Zhang, Kai
    Hou, Yaqing
    Yu, Chao
    Zhao, Mingde
    Wang, Zhen
    Sun, Liang
    [J]. 2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020,
  • [8] Insights into Image Understanding: Segmentation Methods for Object Recognition and Scene Classification
    Mohammed, Sarfaraz Ahmed
    Ralescu, Anca L.
    [J]. ALGORITHMS, 2024, 17 (05)
  • [9] Automatic Image Annotation Based on Scene Analysis
    Liu, Yongmei
    Wongwitit, Tanakrit
    Yu, Linsen
    [J]. INTERNATIONAL JOURNAL OF IMAGE AND GRAPHICS, 2014, 14 (03)
  • [10] SCENE-BASED AUTOMATIC IMAGE ANNOTATION
    Tariq, Amara
    Foroosh, Hassan
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2014, : 3047 - 3051