A Multi-Modal Topic Model for Image Annotation Using Text Analysis

Cited by: 6
Authors
Tian, Jing [1]
Huang, Yu [1]
Guo, Zhi [1]
Qi, Xiang [1]
Chen, Ziyan [1]
Huang, Tinglei [1]
Affiliations
[1] Chinese Acad Sci, Inst Elect, Key Lab Technol Geospatial Informat Proc & Applic, Beijing 100190, Peoples R China
Funding
National High Technology Research and Development Program of China (863 Program);
Keywords
Graphical models; image analysis; statistical learning; text analysis;
DOI
10.1109/LSP.2014.2375341
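Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code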
0808 ; 0809 ;
Abstract
Most of the existing approaches for image annotation generally demand exactly labeled training data, which are often difficult to obtain. In this letter we present a novel model that utilizes the rich surrounding text of images to perform image annotation. Our work makes two main contributions. First, by integrating text analysis, words that describe the salient objects in images are extracted. Second, a new probabilistic topic model is built to jointly model image features, extracted words and surrounding text. Our model is demonstrated to be flexible enough to handle multi-modal features and provide better performance than the state-of-the-art annotation methods.
Pages: 886 - 890
Number of pages: 5
Related papers
50 in total
  • [1] SUPERVISED MULTI-MODAL TOPIC MODEL FOR IMAGE ANNOTATION
    Tran, Thu Hoai
    Choi, Seungjin
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [2] Topic Regression Multi-Modal Latent Dirichlet Allocation for Image Annotation
    Putthividhya, Duangmanee
    Attias, Hagai T.
    Nagarajan, Srikantan S.
    [J]. 2010 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2010, : 3408 - 3415
  • [3] A probabilistic semantic model for image annotation and multi-modal image retrieval
    Zhang, Ruofei
    Zhang, Zhongfei
    Li, Mingjing
    Ma, Wei-Ying
    Zhang, Hong-Jiang
    [J]. MULTIMEDIA SYSTEMS, 2006, 12 (01) : 27 - 33
  • [4] A probabilistic semantic model for image annotation and multi-modal image retrieval
    Zhang, RF
    Zhang, ZF
    Li, MJ
    Ma, WY
    Zhang, HJ
    [J]. TENTH IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION, VOLS 1 AND 2, PROCEEDINGS, 2005, : 846 - 851
  • [5] A probabilistic semantic model for image annotation and multi-modal image retrieval
    Ruofei Zhang
    Zhongfei (Mark) Zhang
    Mingjing Li
    Wei-Ying Ma
    Hong-Jiang Zhang
    [J]. Multimedia Systems, 2006, 12 : 27 - 33
  • [6] Multi-Modal Event Topic Model for Social Event Analysis
    Qian, Shengsheng
    Zhang, Tianzhu
    Xu, Changsheng
    Shao, Jie
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2016, 18 (02) : 233 - 246
  • [7] Sparse Multi-Modal Topical Coding for Image Annotation
    Song, Lingyun
    Luo, Minnan
    Liu, Jun
    Zhang, Lingling
    Qian, Buyue
    Li, Max Haifei
    Zheng, Qinghua
    [J]. NEUROCOMPUTING, 2016, 214 : 162 - 174
  • [8] Multi-modal feature fusion for geographic image annotation
    Li, Ke
    Zou, Changqing
    Bu, Shuhui
    Liang, Yun
    Zhang, Jian
    Gong, Minglun
    [J]. PATTERN RECOGNITION, 2018, 73 : 1 - 14
  • [9] On the Effectiveness of Images in Multi-modal Text Classification: An Annotation Study
    Ma, Chunpeng
    Shen, Aili
    Yoshikawa, Hiyori
    Iwakura, Tomoya
    Beck, Daniel
    Baldwin, Timothy
    [J]. ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2023, 22 (03)
  • [10] Extractive Text-Image Summarization Using Multi-Modal RNN
    Chen, Jingqiang
    Hai Zhuge
    [J]. 2018 14TH INTERNATIONAL CONFERENCE ON SEMANTICS, KNOWLEDGE AND GRIDS (SKG), 2018, : 245 - 248