Multimodal Phased Transformer for Sentiment Analysis

Cited by: 0
Authors
Cheng, Junyan [1 ]
Fostiropoulos, Iordanis [1 ]
Boehm, Barry [1 ]
Soleymani, Mohammad [2 ]
Affiliations
[1] Univ Southern Calif, Los Angeles, CA 90007 USA
[2] USC Inst Creat Technol, Los Angeles, CA USA
Keywords
REPRESENTATIONS;
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Multimodal Transformers achieve superior performance in multimodal learning tasks. However, the quadratic complexity of the self-attention mechanism in Transformers limits their deployment on low-resource devices and makes their inference and training computationally expensive. We propose the multimodal Sparse Phased Transformer (SPT) to alleviate the complexity and memory footprint of self-attention. SPT uses a sampling function to generate a sparse attention matrix and compress a long sequence to a shorter sequence of hidden states. SPT concurrently captures interactions between the hidden states of different modalities at every layer. To further improve the efficiency of our method, we use layer-wise parameter sharing and factorized co-attention, which share parameters between cross-attention blocks, with minimal impact on task performance. We evaluate our model on three sentiment analysis datasets and achieve comparable or superior performance compared with existing methods, with a 90% reduction in the number of parameters. We conclude that SPT, together with parameter sharing, can capture multimodal interactions with reduced model size and improved sample efficiency.
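The abstract describes two ideas that can be illustrated independently of the paper's exact sampling function: (i) a short sequence of learned hidden states cross-attends to a long modality sequence, compressing it, and (ii) a single cross-attention block is reused across layers and modalities, which is what cuts the parameter count. The following is a minimal, hypothetical PyTorch sketch of that pattern only; the class names (`SharedCrossAttention`, `ToyPhasedFusion`), the dense cross-attention used in place of the paper's sparse sampled attention, and all hyperparameters are assumptions for illustration, not the authors' released implementation.

```python
# Hypothetical sketch: compress long modality sequences into a few hidden
# states via cross-attention, reusing one block's parameters everywhere.
import torch
import torch.nn as nn


class SharedCrossAttention(nn.Module):
    """One cross-attention block whose weights are shared by all modality pairs."""

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, hidden: torch.Tensor, sequence: torch.Tensor) -> torch.Tensor:
        # hidden:   (batch, num_hidden, dim) -- short sequence of hidden states
        # sequence: (batch, seq_len, dim)    -- long input sequence of one modality
        attended, _ = self.attn(query=hidden, key=sequence, value=sequence)
        return self.norm(hidden + attended)  # residual connection + layer norm


class ToyPhasedFusion(nn.Module):
    """Compress each modality into `num_hidden` states with one shared block."""

    def __init__(self, dim: int = 64, num_hidden: int = 8, num_layers: int = 2):
        super().__init__()
        self.hidden = nn.Parameter(torch.randn(num_hidden, dim) * 0.02)
        # A single block instance reused at every layer and for every modality:
        # this sharing is what drives the large reduction in parameter count.
        self.block = SharedCrossAttention(dim)
        self.num_layers = num_layers

    def forward(self, modalities: list[torch.Tensor]) -> torch.Tensor:
        batch = modalities[0].size(0)
        states = [self.hidden.expand(batch, -1, -1) for _ in modalities]
        for _ in range(self.num_layers):
            # Each modality's hidden states attend to the other modalities'
            # (long) sequences, so cross-modal interactions occur at every layer.
            states = [
                self.block(
                    states[i],
                    torch.cat([m for j, m in enumerate(modalities) if j != i], dim=1),
                )
                for i in range(len(modalities))
            ]
        # Pool all hidden states into one multimodal representation.
        return torch.cat(states, dim=1).mean(dim=1)


if __name__ == "__main__":
    # Text, audio, and vision sequences of different lengths, same feature dim.
    text, audio, vision = (torch.randn(2, n, 64) for n in (50, 200, 120))
    print(ToyPhasedFusion()([text, audio, vision]).shape)  # torch.Size([2, 64])
```

Because attention cost scales with the product of query and key lengths, attending from a fixed handful of hidden states rather than from the full sequence avoids the quadratic term in sequence length; the paper additionally sparsifies the attention matrix itself, which this dense toy does not attempt.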
Pages: 2447-2458
Number of pages: 12
Related Papers
50 records in total
  • [1] Wang, Yifeng; He, Jiahao; Wang, Di; Wang, Quan; Wan, Bo; Luo, Xuemei. Multimodal transformer with adaptive modality weighting for multimodal sentiment analysis. NEUROCOMPUTING, 2024, 572.
  • [2] Zong, Daoming; Ding, Chaoyue; Li, Baoxiang; Li, Jiakui; Zheng, Ken; Zhou, Qunyan. AcFormer: An Aligned and Compact Transformer for Multimodal Sentiment Analysis. PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023: 833-842.
  • [3] Yu, Jianfei; Chen, Kai; Xia, Rui. Hierarchical Interactive Multimodal Transformer for Aspect-Based Multimodal Sentiment Analysis. IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2023, 14(03): 1966-1978.
  • [4] Li, Zuhe; Guo, Qingbing; Feng, Chengyao; Deng, Lujuan; Zhang, Qiuwen; Zhang, Jianwei; Wang, Fengqin; Sun, Qian. Multimodal Sentiment Analysis Based on Interactive Transformer and Soft Mapping. WIRELESS COMMUNICATIONS & MOBILE COMPUTING, 2022, 2022.
  • [5] Huang, Jiehui; Zhou, Jun; Tang, Zhenchao; Lin, Jiaying; Chen, Calvin Yu-Chian. TMBL: Transformer-based multimodal binding learning model for multimodal sentiment analysis. KNOWLEDGE-BASED SYSTEMS, 2024, 285.
  • [6] Sun, Hao; Chen, Yen-Wei; Lin, Lanfen. TensorFormer: A Tensor-Based Multimodal Transformer for Multimodal Sentiment Analysis and Depression Detection. IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2023, 14(04): 2776-2786.
  • [7] Qi, Qingfu; Lin, Liyuan; Zhang, Rui; Xue, Chengrong. MEDT: Using Multimodal Encoding-Decoding Network as in Transformer for Multimodal Sentiment Analysis. IEEE ACCESS, 2022, 10: 28750-28759.
  • [8] He, Jiaxuan; Mai, Sijie; Hu, Haifeng. A Unimodal Reinforced Transformer With Time Squeeze Fusion for Multimodal Sentiment Analysis. IEEE SIGNAL PROCESSING LETTERS, 2021, 28: 992-996.
  • [9] Zhang, Qiongan; Shi, Lei; Liu, Peiyu; Zhu, Zhenfang; Xu, Liancheng. ICDN: integrating consistency and difference networks by transformer for multimodal sentiment analysis. APPLIED INTELLIGENCE, 2023, 53(12): 16332-16345.
  • [10] Wang, Di; Guo, Xutong; Tian, Yumin; Liu, Jinhui; He, LiHuo; Luo, Xuemei. TETFN: A text enhanced transformer fusion network for multimodal sentiment analysis. PATTERN RECOGNITION, 2023, 136.