A probabilistic semantic model for image annotation and multi-modal image retrieval

Cited by: 18
Authors
Zhang, Ruofei [1]
Zhang, Zhongfei
Li, Mingjing
Ma, Wei-Ying
Zhang, Hong-Jiang
Affiliations
[1] SUNY Binghamton, Dept Comp Sci, Binghamton, NY 13902 USA
[2] Microsoft Res Asia, Beijing 100080, Peoples R China
Keywords
image annotation; multi-modal image retrieval; probabilistic semantic model; evaluation
DOI
10.1007/s00530-006-0025-1
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper addresses the automatic image annotation problem and its application to multi-modal image retrieval. The contribution of our work is three-fold. (1) We propose a probabilistic semantic model in which the visual features and the textual words are connected via a hidden layer; this layer constitutes the semantic concepts to be discovered, so that the synergy among the modalities is exploited explicitly. (2) The association of visual features and textual words is determined in a Bayesian framework, so that the confidence of the association can be provided. (3) We report an extensive evaluation of the prototype system based on the model on a large-scale, visually and semantically diverse image collection crawled from the Web. In the proposed probabilistic model, the hidden concept layer that connects the visual feature layer and the word layer is discovered by fitting a generative model to the training images and annotation words through an Expectation-Maximization (EM) based iterative learning procedure. The evaluation of the prototype system on 17,000 images and 7,736 annotation words automatically extracted from crawled Web pages indicates that, for multi-modal image retrieval, the proposed semantic model and the developed Bayesian framework are superior to a state-of-the-art peer system in the literature.
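The abstract only outlines the learning procedure, so the following is a minimal sketch rather than the authors' model. It assumes a generic symmetric aspect model, P(i, w) = sum_z P(z) P(i|z) P(w|z), in which each image is represented by a discrete index instead of visual features, and fits it with EM on a toy image-by-word count matrix. The function names `fit_aspect_model` and `word_posterior`, the toy counts, and the smoothing constant are all hypothetical.

```python
import random


def fit_aspect_model(counts, n_concepts, n_iters=100, seed=0):
    """EM for a symmetric aspect model P(i, w) = sum_z P(z) P(i|z) P(w|z).

    counts[i][w] is how often word w annotates image i (toy stand-in data).
    This is a generic latent-concept sketch, not the paper's exact model.
    """
    rng = random.Random(seed)
    n_images, n_words = len(counts), len(counts[0])

    def normalized(values):
        total = sum(values)
        return [v / total for v in values]

    # Random, strictly positive initialization of all distributions.
    p_z = [1.0 / n_concepts] * n_concepts
    p_i_z = [normalized([rng.random() + 0.1 for _ in range(n_images)])
             for _ in range(n_concepts)]
    p_w_z = [normalized([rng.random() + 0.1 for _ in range(n_words)])
             for _ in range(n_concepts)]

    eps = 1e-12  # assumed smoothing constant that keeps probabilities positive
    for _ in range(n_iters):
        acc_z = [0.0] * n_concepts
        acc_i = [[0.0] * n_images for _ in range(n_concepts)]
        acc_w = [[0.0] * n_words for _ in range(n_concepts)]
        for i in range(n_images):
            for w in range(n_words):
                c = counts[i][w]
                if c == 0:
                    continue
                # E-step: posterior P(z | i, w) for this image-word pair.
                joint = [p_z[z] * p_i_z[z][i] * p_w_z[z][w]
                         for z in range(n_concepts)]
                total = sum(joint)
                for z in range(n_concepts):
                    resp = c * joint[z] / total
                    acc_z[z] += resp
                    acc_i[z][i] += resp
                    acc_w[z][w] += resp
        # M-step: re-estimate each distribution from the expected counts.
        p_z = normalized([v + eps for v in acc_z])
        p_i_z = [normalized([v + eps for v in row]) for row in acc_i]
        p_w_z = [normalized([v + eps for v in row]) for row in acc_w]
    return p_z, p_i_z, p_w_z


def word_posterior(image, p_z, p_i_z, p_w_z):
    """P(w | image), proportional to sum_z P(z) P(image|z) P(w|z)."""
    n_words = len(p_w_z[0])
    scores = [sum(p_z[z] * p_i_z[z][image] * p_w_z[z][w]
                  for z in range(len(p_z)))
              for w in range(n_words)]
    total = sum(scores)
    return [s / total for s in scores]


# Toy data: images 0-1 use words 0-1, images 2-3 use words 2-3.
toy_counts = [[3, 2, 0, 0], [2, 3, 0, 0], [0, 0, 3, 2], [0, 0, 2, 3]]
p_z, p_i_z, p_w_z = fit_aspect_model(toy_counts, n_concepts=2)
annotation_scores = word_posterior(0, p_z, p_i_z, p_w_z)
```

In this sketch `annotation_scores` is a normalized distribution over words for image 0, so each candidate annotation comes with a probability that can be read as a confidence. A model driven by real visual features would replace the discrete P(i|z) with a density over feature vectors.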
Pages: 27-33
Page count: 7
Related papers
50 in total
  • [1] A probabilistic semantic model for image annotation and multi-modal image retrieval
    Zhang, RF
    Zhang, ZF
    Li, MJ
    Ma, WY
    Zhang, HJ
    TENTH IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION, VOLS 1 AND 2, PROCEEDINGS, 2005, : 846 - 851
  • [2] A probabilistic semantic model for image annotation and multi-modal image retrieval
    Ruofei Zhang
    Zhongfei (Mark) Zhang
    Mingjing Li
    Wei-Ying Ma
    Hong-Jiang Zhang
    Multimedia Systems, 2006, 12 : 27 - 33
  • [3] Multi-modal Image Retrieval for Search-based Image Annotation with RF
    Budikova, Petra
    Batko, Michal
    Zezula, Pavel
    2018 IEEE INTERNATIONAL SYMPOSIUM ON MULTIMEDIA (ISM 2018), 2018, : 52 - 60
  • [4] Semantic relationships in multi-modal graphs for automatic image annotation
    Stathopoulos, Vassilios
    Urban, Jana
    Jose, Joemon
    ADVANCES IN INFORMATION RETRIEVAL, 2008, 4956 : 490 - 497
  • [5] SUPERVISED MULTI-MODAL TOPIC MODEL FOR IMAGE ANNOTATION
    Tran, Thu Hoai
    Choi, Seungjin
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [6] Jointly Image Annotation and Classification Based on Supervised Multi-Modal Hierarchical Semantic Model
    Yin, Chun-yan
    Chen, Yong-Heng
    Zuo, Wan-li
    PATTERN RECOGNITION AND IMAGE ANALYSIS, 2020, 30 (01) : 76 - 86
  • [7] Jointly Image Annotation and Classification Based on Supervised Multi-Modal Hierarchical Semantic Model
    Chun-yan Yin
    Yong-Heng Chen
    Wan-li Zuo
    Pattern Recognition and Image Analysis, 2020, 30 : 76 - 86
  • [8] Deep Image Annotation and Classification by Fusing Multi-Modal Semantic Topics
    Chen, YongHeng
    Zhang, Fuquan
    Zuo, WanLi
    KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS, 2018, 12 (01) : 392 - 412
  • [9] Two-Probabilistic Latent Semantic Model for Image Annotation and Retrieval
    Watcharapinchai, Nattachai
    Aramvith, Supavadee
    Siddhichai, Supakorn
    COMPUTER VISION - ACCV 2010 WORKSHOPS, PT I, 2011, 6468 : 359 - 369
  • [10] Multi-modal semantic image segmentation
    Pemasiri, Akila
    Kien Nguyen
    Sridharan, Sridha
    Fookes, Clinton
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2021, 202