Topic Modeling of Multimodal Data: An Autoregressive Approach

Cited by: 48
Authors
Zheng, Yin [1]
Zhang, Yu-Jin [1]
Larochelle, Hugo [2]
Affiliations
[1] Tsinghua Univ, Dept Elect Engn, Tsinghua Natl Lab Informat Sci & Technol, Beijing 100084, Peoples R China
[2] Univ Sherbrooke, Dept Informat, Sherbrooke, PQ J1K 2R1, Canada
DOI
10.1109/CVPR.2014.178
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Topic modeling based on latent Dirichlet allocation (LDA) has been a framework of choice for dealing with multimodal data, such as in image annotation tasks. Recently, a new type of topic model called the Document Neural Autoregressive Distribution Estimator (DocNADE) was proposed and demonstrated state-of-the-art performance for text document modeling. In this work, we show how to successfully apply and extend this model to multimodal data, such as simultaneous image classification and annotation. Specifically, we propose SupDocNADE, a supervised extension of DocNADE that increases the discriminative power of the hidden topic features by incorporating label information into the training objective of the model. We show how to employ SupDocNADE to learn a joint representation from image visual words, annotation words and class label information. We also describe a simple yet effective way to leverage information about the spatial position of the visual words so that SupDocNADE achieves better performance. We test our model on the LabelMe and UIUC-Sports datasets and show that it compares favorably to other topic models, such as the supervised variant of LDA, and to a Spatial Pyramid Matching (SPM) approach.
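The idea of adding label information to an autoregressive topic model's training objective can be illustrated with a minimal sketch. It is not taken from the paper: the parameter names (`W`, `c`, `V`, `b`, `U`, `d`), the sigmoid hidden units, the plain softmax over the vocabulary, and the weighting `lam` between the two loss terms are all assumptions chosen for readability. Visual words and annotation words are assumed to share a single index space.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def log_softmax(z):
    z = z - z.max()                      # shift for numerical stability
    return z - np.log(np.exp(z).sum())

def hybrid_loss(words, label, W, c, V, b, U, d, lam=1.0):
    """Return (total, gen_nll, disc_nll) for one document.

    words : 1-D int array of word indices, in the order they are modeled
    label : int class index
    W     : (hidden, vocab) word embedding matrix (hypothetical name)
    V, b  : output weights and bias for predicting the next word
    U, d  : output weights and bias for predicting the class label
    lam   : assumed weight on the generative term
    """
    acc = np.array(c, dtype=float)       # c + sum of W[:, v_k] over words seen so far
    gen_nll = 0.0
    for w in words:
        h = sigmoid(acc)                 # hidden state depends only on previous words
        gen_nll -= log_softmax(b + V @ h)[w]   # -log p(v_i | v_<i)
        acc += W[:, w]                   # include the current word for the next step
    h_doc = sigmoid(acc)                 # hidden features of the whole document
    disc_nll = -log_softmax(d + U @ h_doc)[label]   # -log p(y | v)
    return disc_nll + lam * gen_nll, gen_nll, disc_nll

# Small demonstration with random parameters (sizes are arbitrary).
rng = np.random.default_rng(0)
vocab, hidden, classes = 6, 4, 3
params = dict(
    W=0.1 * rng.standard_normal((hidden, vocab)), c=np.zeros(hidden),
    V=0.1 * rng.standard_normal((vocab, hidden)), b=np.zeros(vocab),
    U=0.1 * rng.standard_normal((classes, hidden)), d=np.zeros(classes),
)
total, gen, disc = hybrid_loss(np.array([0, 5, 2, 2]), label=1, lam=0.5, **params)
print(total, gen, disc)
```

In this sketch the generative term scores each word given only the words before it, while the discriminative term scores the class label from the hidden features of the full document. Minimizing their weighted sum is one way to push the hidden features to be useful for classification as well as for modeling the words.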
Pages: 1370 - 1377
Page count: 8
Related papers
50 in total
  • [21] Topic Detecting on Multimodal News Data Based on Deep Learning
    Ni L.
    Wu P.
    Zhou X.
    [J]. Data Analysis and Knowledge Discovery, 2024, 8 (03) : 85 - 97
  • [22] Bayesian nonparametric inference of latent topic hierarchies for multimodal data
    Shimamawari, Takuji
    Eguchi, Koji
    Takasu, Atsuhiro
    [J]. PROCEEDINGS 3RD IAPR ASIAN CONFERENCE ON PATTERN RECOGNITION ACPR 2015, 2015, : 236 - 240
  • [23] Multimodal Topic Modeling by Exploring Characteristics of Short Text Social Media
    Zhang, Huakui
    Cai, Yi
    Ren, Haopeng
    Li, Qing
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 2430 - 2445
  • [24] DUET: Data-Driven Approach Based on Latent Dirichlet Allocation Topic Modeling
    Wang, Yan
    Taylor, John E.
    [J]. JOURNAL OF COMPUTING IN CIVIL ENGINEERING, 2019, 33 (03)
  • [25] Documents as data: A content analysis and topic modeling approach for analyzing responses to ecological disturbances
    Altaweel, Mark
    Bone, Christopher
    Abrams, Jesse
    [J]. ECOLOGICAL INFORMATICS, 2019, 51 : 82 - 95
  • [26] Multimodal Approach to Modeling of Manufacturing Processes
    Pawlewski, Pawel
    [J]. VARIETY MANAGEMENT IN MANUFACTURING: PROCEEDINGS OF THE 47TH CIRP CONFERENCE ON MANUFACTURING SYSTEMS, 2014, 17 : 716 - 720
  • [27] A multimodal approach for face modeling and recognition
    Mahoor, Mohammad H.
    Abdel-Mottaleb, Mohamed
    [J]. IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, 2008, 3 (03) : 431 - 440
  • [28] A contribution to a multimodal approach to knowledge modeling
    Guadagnin R.
    [J]. Pattern Recognition and Image Analysis, 2014, 24 (3) : 395 - 399
  • [29] Topic modeling approach to named entity linking
    Huai, Bao-Xing
    Bao, Teng-Fei
    Zhu, Heng-Shu
    Liu, Qi
    [J]. Liu, Qi, 1600, Chinese Academy of Sciences (25): : 2076 - 2087
  • [30] A multimodal approach for modeling engagement in conversation
    Pellet-Rostaing, Arthur
    Bertrand, Roxane
    Boudin, Auriane
    Rauzy, Stephane
    Blache, Philippe
    [J]. FRONTIERS IN COMPUTER SCIENCE, 2023, 5