Hierarchical Multimodal Fusion Network with Dynamic Multi-task Learning

Cited by: 1
Authors
Wang, Tianyi [1]
Chen, Shu-Ching [1]
Affiliations
[1] Florida Int Univ, Knight Fdn Sch Comp & Informat Sci, Miami, FL 33199 USA
Keywords
hierarchical multimodal fusion; graph fusion; multi-task learning
DOI
10.1109/IRI51335.2021.00034
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Real-world data often contain multiple modalities and non-exclusive labels. Multimodal fusion is a vital step in multimodal learning: it integrates features from the various modalities in a vector space so that the classifier can use the fused vector to generate the final prediction score. Common multimodal fusion approaches rarely consider cross-modality interactions, which play an essential role in exploiting inter-modality relationships and subsequently creating the joint modality embedding. In this paper, we propose a hierarchical multimodal fusion framework with dynamic multi-task learning. It focuses on modeling the joint embedding space for all cross-modality interactions and adjusting the task loss for optimal performance. The proposed model uses a novel hierarchical multimodal fusion network that learns the cross-modal interactions among all combinations of modalities and dynamically allocates a weight to each pair in a sample-aware fashion. Furthermore, a novel dynamic multi-task learning approach handles the multi-label problem by automatically adjusting the learning progress at both the task level and the sample level. We show that the proposed framework outperforms the baselines and some of the state-of-the-art methods. We also demonstrate the flexibility and modularity of the proposed hierarchical multimodal fusion and dynamic multi-task learning units, which can be applied to various types of networks.
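The sample-aware weighting of modality pairs mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' network: the element-wise product used as the pairwise interaction, the linear scoring via `score_vector`, and the names `softmax` and `fuse_pairs` are all assumptions chosen only to show how per-sample weights over all modality pairs might be computed and applied.

```python
import math
from itertools import combinations


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_pairs(modalities, score_vector):
    """Fuse all modality pairs of ONE sample with sample-dependent weights.

    modalities:   dict mapping modality name -> feature list of length d
    score_vector: hypothetical stand-in for a learned scoring layer (length d)
    Returns (weights per pair, fused vector of length d).
    """
    pairs = list(combinations(sorted(modalities), 2))
    # Assumed interaction: element-wise product of the two feature vectors.
    interactions = [
        [a * b for a, b in zip(modalities[p], modalities[q])]
        for p, q in pairs
    ]
    # One scalar score per pair; it depends on this sample's features,
    # so the resulting weights differ from sample to sample.
    scores = [
        sum(w * x for w, x in zip(score_vector, inter))
        for inter in interactions
    ]
    weights = softmax(scores)
    d = len(score_vector)
    # Weighted sum of the pairwise interactions gives the fused vector.
    fused = [
        sum(w * inter[i] for w, inter in zip(weights, interactions))
        for i in range(d)
    ]
    return dict(zip(pairs, weights)), fused


sample = {"audio": [1.0, 0.0], "text": [0.0, 1.0], "video": [1.0, 1.0]}
pair_weights, fused_vector = fuse_pairs(sample, [1.0, 1.0])
```

In this toy sample the audio and text vectors do not overlap, so the audio-text pair receives a lower weight than either pair involving video. In a trained model the scoring parameters would be learned rather than fixed by hand.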
Pages: 208-214
Number of pages: 7