Topic Regression Multi-Modal Latent Dirichlet Allocation for Image Annotation

Cited by: 106
Authors
Putthividhya, Duangmanee [1]
Attias, Hagai T. [2]
Nagarajan, Srikantan S. [3]
Affiliations
[1] UCSD, 9500 Gilman Dr, La Jolla, CA 92307 USA
[2] Golden Metall Inc, San Francisco 91147, CA USA
[3] Univ Calif San Francisco, San Francisco, CA 94143 USA
DOI
10.1109/CVPR.2010.5540000
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present topic-regression multi-modal Latent Dirichlet Allocation (tr-mmLDA), a novel statistical topic model for the task of image and video annotation. At the heart of our new annotation model lies a novel latent variable regression approach to capture correlations between image or video features and annotation texts. Instead of sharing a set of latent topics between the 2 data modalities as in the formulation of correspondence LDA in [2], our approach introduces a regression module to correlate the 2 sets of topics, which captures more general forms of association and allows the number of topics in the 2 data modalities to be different. We demonstrate the power of tr-mmLDA on 2 standard annotation datasets: a 5000-image subset of COREL and a 2687-image LabelMe dataset. The proposed association model shows improved performance over correspondence LDA as measured by caption perplexity.
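The abstract reports results as caption perplexity. This record does not give the paper's exact formula, so the following is only a minimal sketch assuming the usual definition from the topic-model literature: the exponential of the negative average log predictive probability of the held-out caption words. The function name `caption_perplexity` and its input layout are hypothetical, chosen for illustration.

```python
import math


def caption_perplexity(word_probs_per_caption):
    """Perplexity over held-out captions (lower is better).

    Assumed input (hypothetical layout): a list of captions, where each
    caption is a list of the model's predictive probabilities
    p(word | image), one per caption word.
    """
    total_log_prob = 0.0
    total_words = 0
    for probs in word_probs_per_caption:
        for p in probs:
            if p <= 0.0:
                raise ValueError("predictive probabilities must be positive")
            total_log_prob += math.log(p)
            total_words += 1
    if total_words == 0:
        raise ValueError("no caption words given")
    # exp of the negative average per-word log-likelihood
    return math.exp(-total_log_prob / total_words)


if __name__ == "__main__":
    # A model that assigns every caption word probability 1/4 behaves like
    # a uniform guess over 4 words, so its perplexity is 4.
    print(caption_perplexity([[0.25, 0.25], [0.25]]))
```

Under this definition, a model that places more probability on the true caption words gets a lower perplexity, which is the sense in which the abstract compares tr-mmLDA with correspondence LDA.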
Pages: 3408-3415
Page count: 8
Related papers
50 entries in total
  • [1] MMDF-LDA: An improved Multi-Modal Latent Dirichlet Allocation model for social image annotation
    Liu Zheng
    Zhang Caiming
    Chen Caixian
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2018, 104 : 168 - 184
  • [2] SUPERVISED MULTI-MODAL TOPIC MODEL FOR IMAGE ANNOTATION
    Tran, Thu Hoai
    Choi, Seungjin
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014
  • [3] A Multi-Modal Topic Model for Image Annotation Using Text Analysis
    Tian, Jing
    Huang, Yu
    Guo, Zhi
    Qi, Xiang
    Chen, Ziyan
    Huang, Tinglei
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2015, 22 (07) : 886 - 890
  • [4] Boosted Multi-Modal Supervised Latent Dirichlet Allocation for Social Event Classification
    Qian, Shengsheng
    Zhang, Tianzhu
    Xu, Changsheng
    [J]. 2014 22ND INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2014 : 1999 - 2004
  • [5] Pseudo-Supervised Latent Dirichlet Allocation for Image Annotation
    Pham, Huong Thi
    Choi, Seungjin
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC 2015): BIG DATA ANALYTICS FOR HUMAN-CENTRIC SYSTEMS, 2015 : 1924 - 1929
  • [6] The Auto Annotation Latent Dirichlet Allocation
    Xiang, Yingzhuo
    Yang, Dongmei
    Yan, Jikun
    [J]. PROCEEDINGS OF THE FIRST INTERNATIONAL CONFERENCE ON INFORMATION SCIENCES, MACHINERY, MATERIALS AND ENERGY (ICISMME 2015), 2015, 126 : 1908 - 1911
  • [7] Topic Selection in Latent Dirichlet Allocation
    Wang, Biao
    Liu, Zelong
    Li, Maozhen
    Liu, Yang
    Qi, Man
    [J]. 2014 11TH INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY (FSKD), 2014 : 756 - 760
  • [8] Sparse Multi-Modal Topical Coding for Image Annotation
    Song, Lingyun
    Luo, Minnan
    Liu, Jun
    Zhang, Lingling
    Qian, Buyue
    Li, Max Haifei
    Zheng, Qinghua
    [J]. NEUROCOMPUTING, 2016, 214 : 162 - 174
  • [9] Multi-modal feature fusion for geographic image annotation
    Li, Ke
    Zou, Changqing
    Bu, Shuhui
    Liang, Yun
    Zhang, Jian
    Gong, Minglun
    [J]. PATTERN RECOGNITION, 2018, 73 : 1 - 14
  • [10] Max-Margin Latent Dirichlet Allocation for Image Classification and Annotation
    Wang, Yang
    Mori, Greg
    [J]. PROCEEDINGS OF THE BRITISH MACHINE VISION CONFERENCE 2011, 2011