Multimodal Consistency-Based Teacher for Semi-Supervised Multimodal Sentiment Analysis

Cited by: 0
Authors
Yuan, Ziqi [1]
Fang, Jingliang [1,2]
Xu, Hua [1,2]
Gao, Kai [3]
Affiliations
[1] Tsinghua Univ, Dept Comp Sci & Technol, State Key Lab Intelligent Technol & Syst, Beijing 100084, Peoples R China
[2] Samton Jiangxi Technol Dev Co Ltd, Nanchang 330036, Peoples R China
[3] Hebei Univ Sci & Technol, Sch Informat Sci & Engn, Shijiazhuang 050018, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Task analysis; Sentiment analysis; Visualization; Training; Speech processing; Semisupervised learning; Image classification; Consistency-based semi-supervised learning; multimodal sentiment analysis; pseudo-label filtering;
DOI
10.1109/TASLP.2024.3430543
Chinese Library Classification (CLC)
O42 [Acoustics]
Subject Classification Codes
070206; 082403
Abstract
Multimodal sentiment analysis holds significant importance within the realm of human-computer interaction. Because unlabeled online resources are easy to collect while annotation is costly, it is imperative for researchers to develop semi-supervised methods that leverage unlabeled data to enhance model performance. Existing semi-supervised approaches, particularly those designed for standard image classification tasks, are not suitable for multimodal regression tasks because they rely on task-specific augmentation and on thresholds tailored to classification. To address this limitation, we propose the Multimodal Consistency-based Teacher (MC-Teacher), which incorporates a consistency-based pseudo-label technique into semi-supervised multimodal sentiment analysis. We first propose the synergistic consistency assumption, which focuses on the consistency among bimodal representations. Building upon this assumption, we develop a learnable filter network that autonomously learns to identify misleading instances instead of relying on threshold-based filtering. This is achieved by leveraging both the implicit discriminant consistency on unlabeled instances and explicit guidance from training data constructed with labeled instances. Additionally, we design a self-adaptive exponential moving average strategy that decouples the student and teacher networks using a heuristic momentum coefficient. Through both quantitative and qualitative experiments on two benchmark datasets, we demonstrate the outstanding performance of the proposed MC-Teacher. Furthermore, detailed analysis experiments and case studies are provided for each crucial component to intuitively elucidate the inner mechanism and further validate its effectiveness.
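The abstract references two mechanisms that are easiest to follow in code: the exponential moving average (EMA) update that keeps a teacher network as a smoothed copy of the student, and a learnable filter that scores teacher pseudo-labels instead of applying a fixed confidence threshold. The sketch below is illustrative only; the class names, the `adaptive_momentum` schedule, and the filter architecture are assumptions for exposition, not the paper's actual self-adaptive EMA or filter-network design.

```python
# Minimal sketch of a mean-teacher style EMA update with a learnable
# pseudo-label filter. Names and the momentum schedule are illustrative
# assumptions; they do not reproduce MC-Teacher's exact formulation.
import torch
import torch.nn as nn


def ema_update(teacher: nn.Module, student: nn.Module, momentum: float) -> None:
    """teacher <- momentum * teacher + (1 - momentum) * student (no gradients)."""
    with torch.no_grad():
        for t_param, s_param in zip(teacher.parameters(), student.parameters()):
            t_param.mul_(momentum).add_(s_param, alpha=1.0 - momentum)


def adaptive_momentum(step: int, base: float = 0.99, warmup: int = 1000) -> float:
    """Hypothetical schedule: a small coefficient early so the teacher tracks
    the student, settling at `base` after warmup."""
    return min(base, 1.0 - 1.0 / (step + 1)) if step < warmup else base


class FilterNet(nn.Module):
    """Scores a pseudo-labeled instance; a score near 1 means 'keep'."""

    def __init__(self, dim: int):
        super().__init__()
        self.scorer = nn.Sequential(
            nn.Linear(dim, dim // 2), nn.ReLU(), nn.Linear(dim // 2, 1)
        )

    def forward(self, fused_repr: torch.Tensor) -> torch.Tensor:
        return torch.sigmoid(self.scorer(fused_repr)).squeeze(-1)


if __name__ == "__main__":
    dim = 32
    student = nn.Linear(dim, 1)
    teacher = nn.Linear(dim, 1)
    teacher.load_state_dict(student.state_dict())  # teacher starts as a copy

    filter_net = FilterNet(dim)
    x_unlabeled = torch.randn(8, dim)

    with torch.no_grad():
        pseudo_labels = teacher(x_unlabeled).squeeze(-1)  # teacher regression targets
    keep_weight = filter_net(x_unlabeled)  # learned weighting, not a fixed threshold
    loss = (keep_weight * (student(x_unlabeled).squeeze(-1) - pseudo_labels) ** 2).mean()
    loss.backward()

    ema_update(teacher, student, adaptive_momentum(step=10))
```

Per the abstract, the filter network itself is trained from implicit discriminant consistency on unlabeled instances and from explicit guidance on data constructed with labeled instances; the weighted loss above only indicates where such a filter would plug into the unlabeled objective.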
Pages: 3669-3683 (15 pages)