Sparse Multi-Modal Topical Coding for Image Annotation

Cited by: 11
Authors
Song, Lingyun [1]
Luo, Minnan [1]
Liu, Jun [1]
Zhang, Lingling [1]
Qian, Buyue [1]
Li, Max Haifei [2]
Zheng, Qinghua [1]
Affiliations
[1] Xi An Jiao Tong Univ, Dept Comp Sci & Technol, SPKLSTN Lab, Xian 710049, Peoples R China
[2] Union Univ, Dept Comp Sci, Jackson, TN 38305 USA
Funding
US National Science Foundation
Keywords
Topic models; Sparse latent representation; Image annotation; Image retrieval; Regularization; Representation; Completion
DOI
10.1016/j.neucom.2016.06.005
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Image annotation plays a significant role in large-scale image understanding, indexing, and retrieval. Probabilistic Topic Models (PTMs) address this task by learning latent representations of input samples, and existing studies have shown them to be effective. Though useful, PTMs have limitations in interpreting the latent representations of images and texts; addressing these would broaden their applicability. In this paper, we introduce sparsity into PTMs to improve the interpretability of the inferred latent representations. Extending Sparse Topical Coding, which was originally designed for learning from unimodal documents, we propose a non-probabilistic formulation of PTMs for automatic image annotation, namely Sparse Multi-Modal Topical Coding. Beyond controlling sparsity, our model can capture more compact correlations between words and image regions. Empirical results on benchmark datasets show that our model outperforms the baseline models on automatic image annotation and text-based image retrieval. © 2016 Elsevier B.V. All rights reserved.
Pages: 162-174
Number of pages: 13