A Deep and Autoregressive Approach for Topic Modeling of Multimodal Data

Cited by: 47
Authors
Zheng, Yin [1 ]
Zhang, Yu-Jin [1 ]
Larochelle, Hugo [2 ]
Affiliations
[1] Tsinghua Univ, Dept Elect Engn, Beijing 100084, Peoples R China
[2] Univ Sherbrooke, Dept Informat, Sherbrooke, PQ J1K 2R1, Canada
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
Multimodal data modeling; topic model; neural autoregressive model; deep neural network
DOI
10.1109/TPAMI.2015.2476802
CLC Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Topic modeling based on latent Dirichlet allocation (LDA) has been a framework of choice for dealing with multimodal data, such as in image annotation tasks. Another popular approach to modeling multimodal data is deep neural networks, such as the deep Boltzmann machine (DBM). Recently, a new type of topic model called the Document Neural Autoregressive Distribution Estimator (DocNADE) was proposed and demonstrated state-of-the-art performance for text document modeling. In this work, we show how to successfully apply and extend this model to multimodal data, such as simultaneous image classification and annotation. First, we propose SupDocNADE, a supervised extension of DocNADE that increases the discriminative power of the learned hidden topic features, and show how to employ it to learn a joint representation from image visual words, annotation words, and class label information. We test our model on the LabelMe and UIUC-Sports data sets and show that it compares favorably to other topic models. Second, we propose a deep extension of our model and provide an efficient way of training the deep model. Experimental results show that our deep model outperforms its shallow version and reaches state-of-the-art performance on the Multimedia Information Retrieval (MIR) Flickr data set.
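To make the mechanics behind the abstract concrete: DocNADE models a document v = (v_1, ..., v_D) autoregressively, p(v) = prod_i p(v_i | v_<i), computing each conditional from a hidden topic representation of the preceding words; SupDocNADE adds a class-prediction term p(y | v) on the full-document representation so the learned topic features become discriminative. What follows is a minimal NumPy sketch of that structure, not the authors' implementation: the vocabulary/topic/class sizes, the initialization, and the plain softmax output (large vocabularies would typically use a tree-structured softmax instead) are all illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)

V = 1000  # vocabulary size (visual words + annotation words); illustrative
H = 50    # number of hidden topic units; illustrative
C = 8     # number of class labels; illustrative

W = rng.normal(0.0, 0.01, (H, V))  # word-to-hidden embeddings
U = rng.normal(0.0, 0.01, (V, H))  # hidden-to-word output weights
S = rng.normal(0.0, 0.01, (C, H))  # supervised (class) head
b = np.zeros(V)                    # per-word output biases
c = np.zeros(H)                    # hidden biases
d = np.zeros(C)                    # class biases

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def neg_log_likelihood(words, label):
    """-log p(v, y): autoregressive word terms plus one class term.

    Each hidden state depends only on the words before position i, so
    all D hidden states can share a single running sum of embeddings,
    which is what makes DocNADE-style training efficient.
    """
    nll = 0.0
    acc = np.zeros(H)              # running sum of W[:, v_<i]
    for w in words:
        h = sigmoid(c + acc)       # hidden topic features of v_<i
        p = softmax(b + U @ h)     # p(v_i | v_<i); plain softmax here
        nll -= np.log(p[w])
        acc += W[:, w]             # fold v_i into the context
    h_doc = sigmoid(c + acc)       # representation of the whole document
    p_y = softmax(d + S @ h_doc)   # supervised term p(y | v)
    nll -= np.log(p_y[label])
    return nll

doc = rng.integers(0, V, size=20)  # toy document of word indices
print(neg_log_likelihood(doc, label=3))

Training would minimize this joint negative log-likelihood by gradient descent; the paper's deep variant adds further hidden layers before the output heads, together with an efficient training procedure described there.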
Pages: 1056-1069
Page count: 14
Related Papers (50 in total; entries 31-40 shown)
  • [31] Bayesian nonparametric inference of latent topic hierarchies for multimodal data
    Shimamawari, Takuji
    Eguchi, Koji
    Takasu, Atsuhiro
[J]. PROCEEDINGS 3RD IAPR ASIAN CONFERENCE ON PATTERN RECOGNITION ACPR 2015, 2015: 236 - 240
  • [32] Deep Multimodal Fusion: A Hybrid Approach
    Amer, Mohamed R.
    Shields, Timothy
    Siddiquie, Behjat
    Tamrakar, Amir
    Divakaran, Ajay
    Chai, Sek
    [J]. INTERNATIONAL JOURNAL OF COMPUTER VISION, 2018, 126 (2-4) : 440 - 456
  • [34] Adaptive framework for deep learning based dynamic and temporal topic modeling from big data
    Pathak, Ajeet R.
    Pandey, Manjusha
    Rautaray, Siddharth
[J]. Recent Patents on Engineering, 2020, 14 (03): 394 - 402
  • [35] Multimodal Topic Modeling by Exploring Characteristics of Short Text Social Media
    Zhang, Huakui
    Cai, Yi
    Ren, Haopeng
    Li, Qing
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 2430 - 2445
  • [36] DUET: Data-Driven Approach Based on Latent Dirichlet Allocation Topic Modeling
    Wang, Yan
    Taylor, John E.
    [J]. JOURNAL OF COMPUTING IN CIVIL ENGINEERING, 2019, 33 (03)
  • [37] Documents as data: A content analysis and topic modeling approach for analyzing responses to ecological disturbances
    Altaweel, Mark
    Bone, Christopher
    Abrams, Jesse
    [J]. ECOLOGICAL INFORMATICS, 2019, 51 : 82 - 95
  • [38] Scalable Deep Poisson Factor Analysis for Topic Modeling
    Gan, Zhe
    Chen, Changyou
    Henao, Ricardo
    Carlson, David
    Carin, Lawrence
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 37, 2015, 37 : 1823 - 1832
  • [39] Deep Topic Modeling by Multilayer Bootstrap Network and Lasso
    Wang, Jianyu
    Zhang, Xiao-Lei
[J]. 2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021: 2470 - 2475
  • [40] Neural Topic Modeling with Deep Mutual Information Estimation
    Xu, Kang
    Lu, Xiaoqiu
    Li, Yuan-fang
    Wu, Tongtong
    Qi, Guilin
    Ye, Ning
    Wang, Dong
    Zhou, Zheng
    [J]. BIG DATA RESEARCH, 2022, 30