A Multi-Modal Topic Model for Image Annotation Using Text Analysis

Cited by: 6
Authors
Tian, Jing [1]
Huang, Yu [1]
Guo, Zhi [1]
Qi, Xiang [1]
Chen, Ziyan [1]
Huang, Tinglei [1]
Affiliation
[1] Chinese Acad Sci, Inst Elect, Key Lab Technol Geospatial Informat Proc & Applic, Beijing 100190, Peoples R China
Funding
National High Technology Research and Development Program of China (863 Program)
Keywords
Graphical models; image analysis; statistical learning; text analysis
DOI
10.1109/LSP.2014.2375341
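Chinese Library Classification
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline classification codes
0808; 0809
Abstract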
Most of the existing approaches for image annotation generally demand exactly labeled training data, which are often difficult to obtain. In this letter we present a novel model that utilizes the rich surrounding text of images to perform image annotation. Our work makes two main contributions. First, by integrating text analysis, words that describe the salient objects in images are extracted. Second, a new probabilistic topic model is built to jointly model image features, extracted words and surrounding text. Our model is demonstrated to be flexible enough to handle multi-modal features and provide better performance than the state-of-the-art annotation methods.
Pages: 886-890
Number of pages: 5