Representation learning using step-based deep multi-modal autoencoders

Cited by: 15
Authors
Bhatt, Gaurav [1]
Jha, Piyush [2]
Raman, Balasubramanian [1]
Affiliations
[1] IITR, Roorkee 247667, Uttar Pradesh, India
[2] MNIT, Jaipur 302017, Rajasthan, India
Keywords
Representation learning; Transfer learning; Convolution autoencoders; Multilingual document classification; Canonical correlation analysis; Classification; Subspace; Network
DOI
10.1016/j.patcog.2019.05.032
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Deep learning techniques have been successfully used to learn a common representation for multi-view data, wherein different modalities are projected onto a common subspace. Broadly, the techniques used for common representation learning fall into two categories: 'canonical correlation-based' approaches and 'autoencoder-based' approaches. In this paper, we investigate the performance of deep autoencoder-based methods on multi-view data. We propose a novel step-based correlation multi-modal deep convolution neural network (CorrMCNN), which reconstructs one view of the data given the other while increasing the interaction between the representations at each hidden layer, that is, at every intermediate step. Step reconstruction relaxes the constraint of reconstructing the original data; instead, the objective function is optimized for the reconstruction of representative features. This helps the proposed model generalize efficiently to representation and transfer learning tasks on high-dimensional data. Finally, we evaluate the performance of the proposed model on three multi-view and cross-modal problems, viz., audio articulation, cross-modal image retrieval, and multilingual (cross-language) document classification. Through extensive experiments, we find that the proposed model performs much better than the current state-of-the-art deep learning techniques on all three multi-view and cross-modal tasks. © 2019 Elsevier Ltd. All rights reserved.
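The abstract does not state the objective function, so the following is only an illustrative sketch of the general idea of encouraging interaction between two views at every hidden layer. It is not the authors' CorrMCNN. The function names `correlation` and `step_loss`, the weight `lam`, and the use of a squared difference between hidden features as a stand-in for feature reconstruction are all assumptions made for illustration.

```python
import numpy as np

def correlation(h_x, h_y, eps=1e-8):
    """Sum, over feature dimensions, of the Pearson correlation between
    two batches of hidden representations of shape (batch, dim)."""
    x_c = h_x - h_x.mean(axis=0)  # center each feature over the batch
    y_c = h_y - h_y.mean(axis=0)
    num = (x_c * y_c).sum(axis=0)
    den = np.sqrt((x_c ** 2).sum(axis=0) * (y_c ** 2).sum(axis=0)) + eps
    return float((num / den).sum())

def step_loss(hidden_x, hidden_y, lam=1.0):
    """Hypothetical step-wise objective (an assumption, not the paper's loss).

    hidden_x and hidden_y are lists holding one array per hidden layer
    for each view. At every layer the loss penalizes the mean squared
    difference between the two views' features and rewards their
    correlation, weighted by lam.
    """
    total = 0.0
    for h_x, h_y in zip(hidden_x, hidden_y):
        feature_error = float(((h_x - h_y) ** 2).mean())
        total += feature_error - lam * correlation(h_x, h_y)
    return total
```

In this sketch, two views whose hidden features agree perfectly at every layer reach the minimum value, the negative of `lam` times the total number of hidden dimensions; in a real model the per-layer arrays would come from the two encoders and the loss would be minimized by gradient descent.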
Pages: 12-23
Number of pages: 12