Efficient Low-rank Multimodal Fusion with Modality-Specific Factors

Cited: 0
Authors
Liu, Zhun [1 ]
Shen, Ying [1 ]
Lakshminarasimhan, Varun Bharadhwaj [1 ]
Liang, Paul Pu [1 ]
Zadeh, Amir [1 ]
Morency, Louis-Philippe [1 ]
Affiliations
[1] Carnegie Mellon Univ, Sch Comp Sci, Pittsburgh, PA 15213 USA
Funding
U.S. National Science Foundation
Keywords
DOI
Not available
CLC Number
TP39 [Computer Applications]
Discipline Codes
081203; 0835
Abstract
Multimodal research is an emerging field of artificial intelligence, and one of its central problems is multimodal fusion: the process of integrating multiple unimodal representations into one compact multimodal representation. Previous work in this area has exploited the expressiveness of tensors for multimodal representation. However, these methods often suffer from an exponential increase in dimensionality and computational complexity caused by transforming the inputs into a tensor. In this paper, we propose the Low-rank Multimodal Fusion method, which performs multimodal fusion using low-rank tensors to improve efficiency. We evaluate our model on three different tasks: multimodal sentiment analysis, speaker trait analysis, and emotion recognition. Our model achieves competitive results on all three tasks while drastically reducing computational complexity. Additional experiments show that our model performs robustly across a wide range of low-rank settings and is substantially more efficient, in both training and inference, than other methods that utilize tensor representations.
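
The efficiency gain comes from never materializing the outer-product tensor of the unimodal representations: the fusion weight tensor is decomposed into rank-r modality-specific factors, so the fused vector reduces to an elementwise product of per-modality projections summed over the r rank-1 terms. The PyTorch sketch below illustrates this decomposition; class and parameter names, shapes, and initialization here are illustrative assumptions, not the authors' released implementation.

    # Minimal sketch of low-rank multimodal fusion (hypothetical names/shapes).
    import torch
    import torch.nn as nn

    class LowRankFusion(nn.Module):
        def __init__(self, input_dims, rank, output_dim):
            super().__init__()
            # One rank-r factor per modality, acting on the 1-padded input.
            self.factors = nn.ParameterList([
                nn.Parameter(torch.randn(rank, d + 1, output_dim) * 0.1)
                for d in input_dims
            ])
            # Learned weights that combine the r rank-1 terms, plus a bias.
            self.fusion_weights = nn.Parameter(torch.randn(1, rank) * 0.1)
            self.fusion_bias = nn.Parameter(torch.zeros(1, output_dim))

        def forward(self, inputs):
            # inputs: one (batch, d_m) tensor per modality.
            batch = inputs[0].size(0)
            fused = None
            for z, w in zip(inputs, self.factors):
                ones = torch.ones(batch, 1, dtype=z.dtype, device=z.device)
                z1 = torch.cat([z, ones], dim=1)           # (batch, d_m + 1)
                proj = torch.einsum('bd,rdo->rbo', z1, w)  # (rank, batch, out)
                # The elementwise product across modalities stands in for the
                # explicit outer-product tensor, keeping cost linear in the
                # number of modalities rather than multiplicative.
                fused = proj if fused is None else fused * proj
            out = torch.einsum('kr,rbo->bo', self.fusion_weights, fused)
            return out + self.fusion_bias

As a rough sense of scale, with three hypothetical modality dimensions of 300, 74, and 35, LowRankFusion([300, 74, 35], rank=4, output_dim=64) stores about 4 x (301 + 75 + 36) x 64 ≈ 105K factor parameters, whereas the full fusion tensor it replaces would have 301 x 75 x 36 x 64 ≈ 52M entries.
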
Pages: 2247-2256
Number of pages: 10
Related Papers
50 records in total
  • [1] Is susceptibility to perceptual migration and fusion modality-specific or multimodal?
    Marcel, A
    Mackintosh, B
    Postma, P
    Cusack, R
    Vuckovich, J
    Nimmo-Smith, I
    Cox, SML
    NEUROPSYCHOLOGIA, 2006, 44 (05) : 693 - 710
  • [2] Dual Low-Rank Multimodal Fusion
    Jin, Tao
    Huang, Siyu
    Li, Yingming
    Zhang, Zhongfei
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: EMNLP 2020, 2020 : 377 - 387
  • [3] Modality-specific Learning Rates for Effective Multimodal Additive Late-fusion
    Yao, Yiqun
    Mihalcea, Rada
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), 2022 : 1824 - 1834
  • [4] Low-rank Multimodal Fusion Algorithm Based on Context Modeling
    Bai, Zongwen
    Chen, Xiaohuan
    Zhou, Meili
    Yi, Tingting
    Chien, Wei-Che
    JOURNAL OF INTERNET TECHNOLOGY, 2021, 22 (04) : 913 - 921
  • [5] AdaMoW: Multimodal Sentiment Analysis Based on Adaptive Modality-Specific Weight Fusion Network
    Zhang, Junling
    Wu, Xuemei
    Huang, Changqin
    IEEE ACCESS, 2023, 11 : 48410 - 48420
  • [6] Efficient low-rank multi-component fusion with component-specific factors in image-recipe retrieval
    Zhao, Wenyu
    Zhou, Dong
    Cao, Buqing
    Zhang, Kai
    Chen, Jinjun
    MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 83 (2) : 3601 - 3619
  • [7] Sustained Spatial Attention in Touch: Modality-Specific and Multimodal Mechanisms
    Sambo, Chiara F.
    Forster, Bettina
    THE SCIENTIFIC WORLD JOURNAL, 2011, 11 : 199 - 213
  • [8] Multimodal Medical Image Fusion Based on Multiple Latent Low-Rank Representation
    Lou, Xi-Cheng
    Feng, Xin
    COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE, 2021, 2021
  • [9] Sparse Low-Rank Fusion based Deep Features for Missing Modality Face Recognition
    Shao, Ming
    Ding, Zhengming
    Fu, Yun
    2015 11TH IEEE INTERNATIONAL CONFERENCE AND WORKSHOPS ON AUTOMATIC FACE AND GESTURE RECOGNITION (FG), VOL. 1, 2015