Causal Inference for Modality Debiasing in Multimodal Emotion Recognition

Cited by: 0
Authors
Kim, Juyeon [1 ]
Hong, Juyoung [1 ]
Choi, Yukyung [1 ]
Affiliations
[1] Sejong Univ, Dept Convergence Engn Intelligent Drone, Seoul 05006, South Korea
Source
APPLIED SCIENCES-BASEL | 2024, Vol. 14, Issue 23
Keywords
emotion recognition; multimodal learning; causal inference;
DOI
10.3390/app142311397
CLC Number
O6 [Chemistry];
Discipline Code
0703;
Abstract
Multimodal emotion recognition (MER) aims to enhance the understanding of human emotions by integrating visual, auditory, and textual modalities. However, previous MER approaches often depend on a dominant modality rather than considering all modalities, leading to poor generalization. To address this, we propose Causal Inference in Multimodal Emotion Recognition (CausalMER), which leverages counterfactual reasoning and causal graphs to capture relationships between modalities and reduce direct modality effects contributing to bias. This allows CausalMER to make unbiased predictions while being easily applied to existing MER methods in a model-agnostic manner, without requiring any architectural modifications. We evaluate CausalMER on the IEMOCAP and CMU-MOSEI datasets, widely used benchmarks in MER, and compare it with existing methods. On the IEMOCAP dataset with the MulT backbone, CausalMER achieves an average accuracy of 83.4%. On the CMU-MOSEI dataset, the average accuracies with MulT, PMR, and DMD backbones are 50.1%, 48.8%, and 48.8%, respectively. Experimental results demonstrate that CausalMER is robust in missing modality scenarios, as shown by its low standard deviation in performance drop gaps. Additionally, we evaluate modality contributions and show that CausalMER achieves balanced contributions from each modality, effectively mitigating direct biases from individual modalities.
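The abstract describes debiasing via counterfactual reasoning: removing a modality's direct effect on the prediction so that only the cross-modal (indirect) effect remains. A common way to operationalize this is to subtract the logits of a single-modality (direct-effect) branch from the fused (total-effect) logits. The sketch below illustrates that idea only; the function name `counterfactual_debias`, the scaling factor `alpha`, and the specific subtraction form are illustrative assumptions, not the exact CausalMER formulation.

```python
import numpy as np

def counterfactual_debias(fused_logits, direct_logits, alpha=1.0):
    """Counterfactual-debiasing sketch (illustrative, not CausalMER's
    exact method): subtract the natural direct effect (NDE) of one
    modality from the total effect (TE), keeping the indirect,
    cross-modal effect for the final prediction."""
    te = fused_logits            # total effect: all modalities present
    nde = alpha * direct_logits  # direct effect of the biased modality alone
    tie = te - nde               # total indirect effect, used for prediction
    return tie

# Toy example: the text modality dominates the fused logits for class 0.
fused = np.array([2.0, 0.5, 0.1])      # fused (multimodal) logits
text_only = np.array([1.5, 0.1, 0.0])  # hypothetical text-only logits
debiased = counterfactual_debias(fused, text_only)
print(debiased)  # -> [0.5 0.4 0.1]; the direct text bias is removed
```

With the direct text effect subtracted, the margin between classes shrinks, so the prediction relies more on what the modalities contribute jointly rather than on the dominant modality alone.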
Pages: 17