A probabilistic semantic model for image annotation and multi-modal image retrieval

Cited by: 18
Authors
Zhang, Ruofei [1]
Zhang, Zhongfei
Li, Mingjing
Ma, Wei-Ying
Zhang, Hong-Jiang
Affiliations
[1] SUNY Binghamton, Dept Comp Sci, Binghamton, NY 13902 USA
[2] Microsoft Res Asia, Beijing 100080, Peoples R China
Keywords
image annotation; multi-modal image retrieval; probabilistic semantic model; evaluation;
DOI
10.1007/s00530-006-0025-1
CLC Classification
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
This paper addresses the automatic image annotation problem and its application to multi-modal image retrieval. The contribution of our work is three-fold. (1) We propose a probabilistic semantic model in which the visual features and the textual words are connected via a hidden layer of semantic concepts to be discovered, explicitly exploiting the synergy between the modalities. (2) The association between visual features and textual words is determined in a Bayesian framework, so that a confidence for each association can be provided. (3) We report an extensive evaluation of a prototype system based on the model, using a large-scale, visually and semantically diverse image collection crawled from the Web. In the proposed probabilistic model, a hidden concept layer connecting the visual-feature layer and the word layer is discovered by fitting a generative model to the training images and their annotation words through an Expectation-Maximization (EM) based iterative learning procedure. Evaluation of the prototype system for multi-modal image retrieval on 17,000 images and 7,736 annotation words automatically extracted from crawled Web pages indicates that the proposed semantic model and the developed Bayesian framework are superior to a state-of-the-art peer system in the literature.
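To make the abstract's EM procedure concrete, the following is a minimal sketch of a PLSA-style generative model with a hidden concept layer linking visual tokens and words, fitted by EM. All names, the toy data, and the discretization of visual features into tokens are assumptions for illustration; the paper's actual model additionally handles continuous features and a Bayesian association step.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy co-occurrence counts n[i, v, w]: how often image i pairs
# visual token v with annotation word w (real visual features would first be
# discretized into such tokens).
n_images, n_visual, n_words, n_concepts = 4, 5, 6, 2
counts = rng.integers(0, 5, size=(n_images, n_visual, n_words)).astype(float)

# Model: P(v, w | i) = sum_z P(z | i) * P(v | z) * P(w | z)
p_z_given_i = rng.dirichlet(np.ones(n_concepts), size=n_images)   # (I, Z)
p_v_given_z = rng.dirichlet(np.ones(n_visual), size=n_concepts)   # (Z, V)
p_w_given_z = rng.dirichlet(np.ones(n_words), size=n_concepts)    # (Z, W)

for _ in range(50):
    # E-step: posterior over concepts, P(z | i, v, w), shape (I, V, W, Z)
    joint = (p_z_given_i[:, None, None, :]
             * p_v_given_z.T[None, :, None, :]
             * p_w_given_z.T[None, None, :, :])
    post = joint / joint.sum(axis=-1, keepdims=True)

    # M-step: re-estimate the three multinomials from expected counts
    weighted = counts[..., None] * post                    # (I, V, W, Z)
    p_z_given_i = weighted.sum(axis=(1, 2))
    p_z_given_i /= p_z_given_i.sum(axis=1, keepdims=True)
    p_v_given_z = weighted.sum(axis=(0, 2)).T              # (Z, V)
    p_v_given_z /= p_v_given_z.sum(axis=1, keepdims=True)
    p_w_given_z = weighted.sum(axis=(0, 1)).T              # (Z, W)
    p_w_given_z /= p_w_given_z.sum(axis=1, keepdims=True)

# Annotation: rank words for an image by P(w | i) = sum_z P(z | i) * P(w | z)
word_scores = p_z_given_i @ p_w_given_z
print(word_scores[0])
```

After fitting, each image's word distribution `word_scores[i]` can be thresholded or top-k selected to produce annotations, which in turn enable text-based retrieval of unlabeled images.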
Pages: 27-33
Page count: 7
Related Papers
50 records in total
  • [21] Semantic annotation and retrieval of image collections
    Osman, Taha
    Thakker, Dhavalkumar
    Schaefer, Gerald
    Leroy, Maxime
    Fournier, Alain
21ST EUROPEAN CONFERENCE ON MODELLING AND SIMULATION ECMS 2007: SIMULATIONS IN UNITED EUROPE, 2007, : 324+
  • [22] Semantic Cohesion for Image Annotation and Retrieval
    Jair Escalante, Hugo
    Enrique Sucar, Luis
    Montes-y-Gomez, Manuel
    COMPUTACION Y SISTEMAS, 2012, 16 (01): : 121 - 126
  • [23] A Multi-modal SPM Model for Image Classification
    Zheng, Peng
    Zhao, Zhong-Qiu
    Gao, Jun
    INTELLIGENT COMPUTING METHODOLOGIES, ICIC 2017, PT III, 2017, 10363 : 525 - 535
  • [24] Fabric image retrieval based on multi-modal feature fusion
    Ning Zhang
    Yixin Liu
    Zhongjian Li
    Jun Xiang
    Ruru Pan
    Signal, Image and Video Processing, 2024, 18 : 2207 - 2217
  • [25] Multi-modal unsupervised domain adaptation for semantic image segmentation
    Hu, Sijie
    Bonardi, Fabien
    Bouchafa, Samia
    Sidibe, Desire
    PATTERN RECOGNITION, 2023, 137
  • [26] Fabric image retrieval based on multi-modal feature fusion
    Zhang, Ning
    Liu, Yixin
    Li, Zhongjian
    Xiang, Jun
    Pan, Ruru
    SIGNAL IMAGE AND VIDEO PROCESSING, 2024, 18 (03) : 2207 - 2217
  • [27] Privacy-Preserving Image Retrieval with Multi-Modal Query
    Zhou, Fucai
    Zhang, Zongye
    Hou, Ruiwei
    COMPUTER JOURNAL, 2023, 67 (05): : 1979 - 1992
  • [28] Leveraging multi-modal fusion for graph-based image annotation
    Amiri, S. Hamid
    Jamzad, Mansour
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2018, 55 : 816 - 828
  • [29] Topic Regression Multi-Modal Latent Dirichlet Allocation for Image Annotation
    Putthividhya, Duangmanee
    Attias, Hagai T.
    Nagarajan, Srikantan S.
    2010 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2010, : 3408 - 3415
  • [30] Jointly Image Annotation and Classification Based on Supervised Multi-Modal Hierarchical Semantic Model (vol 30, pg 76, 2020)
    Yin, Chun-yan
    Chen, Yong-Heng
    Zuo, Wan-li
    PATTERN RECOGNITION AND IMAGE ANALYSIS, 2020, 30 (03) : 566 - 566