Multimodal Phased Transformer for Sentiment Analysis

Cited by: 0
Authors
Cheng, Junyan [1 ]
Fostiropoulos, Iordanis [1 ]
Boehm, Barry [1 ]
Soleymani, Mohammad [2 ]
Affiliations
[1] Univ Southern Calif, Los Angeles, CA 90007 USA
[2] USC Inst Creat Technol, Los Angeles, CA USA
Keywords
REPRESENTATIONS;
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Multimodal Transformers achieve superior performance in multimodal learning tasks. However, the quadratic complexity of the self-attention mechanism in Transformers limits their deployment on low-resource devices and makes their inference and training computationally expensive. We propose the multimodal Sparse Phased Transformer (SPT) to alleviate the complexity and memory footprint of self-attention. SPT uses a sampling function to generate a sparse attention matrix and compress a long sequence to a shorter sequence of hidden states. SPT concurrently captures interactions between the hidden states of different modalities at every layer. To further improve the efficiency of our method, we use layer-wise parameter sharing and factorized co-attention, which share parameters between cross-attention blocks, with minimal impact on task performance. We evaluate our model on three sentiment analysis datasets and achieve comparable or superior performance compared with existing methods, with a 90% reduction in the number of parameters. We conclude that SPT, together with parameter sharing, can capture multimodal interactions with reduced model size and improved sample efficiency.
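The abstract describes two ideas that can be illustrated independently of the paper's exact sampling function: (i) a short sequence of learned hidden states cross-attends to a long modality sequence, compressing it, and (ii) a single cross-attention block is reused across layers and modalities, which is what cuts the parameter count. The following is a minimal, hypothetical PyTorch sketch of that pattern only; the class names (`SharedCrossAttention`, `ToyPhasedFusion`), the dense cross-attention used in place of the paper's sparse sampled attention, and all hyperparameters are assumptions for illustration, not the authors' released implementation.

```python
# Hypothetical sketch: compress long modality sequences into a few hidden
# states via cross-attention, reusing one block's parameters everywhere.
import torch
import torch.nn as nn


class SharedCrossAttention(nn.Module):
    """One cross-attention block whose weights are shared by all modality pairs."""

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, hidden: torch.Tensor, sequence: torch.Tensor) -> torch.Tensor:
        # hidden:   (batch, num_hidden, dim) -- short sequence of hidden states
        # sequence: (batch, seq_len, dim)    -- long input sequence of one modality
        attended, _ = self.attn(query=hidden, key=sequence, value=sequence)
        return self.norm(hidden + attended)  # residual connection + layer norm


class ToyPhasedFusion(nn.Module):
    """Compress each modality into `num_hidden` states with one shared block."""

    def __init__(self, dim: int = 64, num_hidden: int = 8, num_layers: int = 2):
        super().__init__()
        self.hidden = nn.Parameter(torch.randn(num_hidden, dim) * 0.02)
        # A single block instance reused at every layer and for every modality:
        # this sharing is what drives the large reduction in parameter count.
        self.block = SharedCrossAttention(dim)
        self.num_layers = num_layers

    def forward(self, modalities: list[torch.Tensor]) -> torch.Tensor:
        batch = modalities[0].size(0)
        states = [self.hidden.expand(batch, -1, -1) for _ in modalities]
        for _ in range(self.num_layers):
            # Each modality's hidden states attend to the other modalities'
            # (long) sequences, so cross-modal interactions occur at every layer.
            states = [
                self.block(
                    states[i],
                    torch.cat([m for j, m in enumerate(modalities) if j != i], dim=1),
                )
                for i in range(len(modalities))
            ]
        # Pool all hidden states into one multimodal representation.
        return torch.cat(states, dim=1).mean(dim=1)


if __name__ == "__main__":
    # Text, audio, and vision sequences of different lengths, same feature dim.
    text, audio, vision = (torch.randn(2, n, 64) for n in (50, 200, 120))
    print(ToyPhasedFusion()([text, audio, vision]).shape)  # torch.Size([2, 64])
```

Because attention cost scales with the product of query and key lengths, attending from a fixed handful of hidden states rather than from the full sequence avoids the quadratic term in sequence length; the paper additionally sparsifies the attention matrix itself, which this dense toy does not attempt.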
Pages: 2447-2458
Number of pages: 12
Related Papers
50 records in total
  • [1] Wang, Yifeng; He, Jiahao; Wang, Di; Wang, Quan; Wan, Bo; Luo, Xuemei. Multimodal transformer with adaptive modality weighting for multimodal sentiment analysis. NEUROCOMPUTING, 2024, 572.
  • [2] Zong, Daoming; Ding, Chaoyue; Li, Baoxiang; Li, Jiakui; Zheng, Ken; Zhou, Qunyan. AcFormer: An Aligned and Compact Transformer for Multimodal Sentiment Analysis. PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023: 833-842.
  • [3] Yu, Jianfei; Chen, Kai; Xia, Rui. Hierarchical Interactive Multimodal Transformer for Aspect-Based Multimodal Sentiment Analysis. IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2023, 14(03): 1966-1978.
  • [4] Li, Zuhe; Guo, Qingbing; Feng, Chengyao; Deng, Lujuan; Zhang, Qiuwen; Zhang, Jianwei; Wang, Fengqin; Sun, Qian. Multimodal Sentiment Analysis Based on Interactive Transformer and Soft Mapping. WIRELESS COMMUNICATIONS & MOBILE COMPUTING, 2022, 2022.
  • [5] Huang, Jiehui; Zhou, Jun; Tang, Zhenchao; Lin, Jiaying; Chen, Calvin Yu-Chian. TMBL: Transformer-based multimodal binding learning model for multimodal sentiment analysis. KNOWLEDGE-BASED SYSTEMS, 2024, 285.
  • [6] Sun, Hao; Chen, Yen-Wei; Lin, Lanfen. TensorFormer: A Tensor-Based Multimodal Transformer for Multimodal Sentiment Analysis and Depression Detection. IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2023, 14(04): 2776-2786.
  • [7] Qi, Qingfu; Lin, Liyuan; Zhang, Rui; Xue, Chengrong. MEDT: Using Multimodal Encoding-Decoding Network as in Transformer for Multimodal Sentiment Analysis. IEEE ACCESS, 2022, 10: 28750-28759.
  • [8] He, Jiaxuan; Mai, Sijie; Hu, Haifeng. A Unimodal Reinforced Transformer With Time Squeeze Fusion for Multimodal Sentiment Analysis. IEEE SIGNAL PROCESSING LETTERS, 2021, 28: 992-996.
  • [9] Zhang, Qiongan; Shi, Lei; Liu, Peiyu; Zhu, Zhenfang; Xu, Liancheng. ICDN: integrating consistency and difference networks by transformer for multimodal sentiment analysis. APPLIED INTELLIGENCE, 2023, 53(12): 16332-16345.
  • [10] Wang, Di; Guo, Xutong; Tian, Yumin; Liu, Jinhui; He, LiHuo; Luo, Xuemei. TETFN: A text enhanced transformer fusion network for multimodal sentiment analysis. PATTERN RECOGNITION, 2023, 136.