Comprehensive Semi-Supervised Multi-Modal Learning

Cited by: 0
Authors
Yang, Yang [1]
Wang, Ke-Tao [1]
Zhan, De-Chuan [1]
Xiong, Hui [2]
Jiang, Yuan [1]
Affiliations
[1] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing, Jiangsu, Peoples R China
[2] Rutgers State Univ, New Brunswick, NJ USA
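Funding
National Key Research and Development Program;
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104; 0812; 0835; 1405
Abstract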
Multi-modal learning refers to the process of learning a precise model that captures the joint representations of different modalities. Despite its promise for multi-modal learning, the co-regularization method is based on the consistency principle and assumes that each modality is sufficient, an assumption that usually does not hold for real-world multi-modal data. Indeed, because of modal insufficiency in real-world applications, heterogeneous modalities diverge from one another, which poses a critical challenge for multi-modal learning. To this end, we propose a novel Comprehensive Multi-Modal Learning (CMML) framework, which strikes a balance between consistency and divergence across modalities by accounting for insufficiency within one unified framework. Specifically, we use an instance-level attention mechanism to weight the sufficiency of each instance on different modalities. Moreover, novel diversity regularization and robust consistency metrics are designed for discovering insufficient modalities. Our empirical studies show the superior performance of CMML on real-world data in terms of various criteria.
Pages: 4092-4098
Number of pages: 7
Related papers
50 in total
  • [31] Human Activity Recognition Using Semi-supervised Multi-modal DEC for Instagram Data
    Kim, Dongmin
    Han, Sumin
    Son, Heesuk
    Lee, Dongman
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2020, PT I, 2020, 12084 : 869 - 880
  • [32] Multi-Modal Gaussian Process Latent Variable Model With Semi-Supervised Label Dequantization
    Maeda, Keisuke
    Matsumoto, Masanao
    Saito, Naoki
    Ogawa, Takahiro
    Haseyama, Miki
    IEEE ACCESS, 2024, 12 : 127244 - 127258
  • [33] Multi-modal contrastive mutual learning and pseudo-label re-learning for semi-supervised medical image segmentation
    Zhang, Shuo
    Zhang, Jiaojiao
    Tian, Biao
    Lukasiewicz, Thomas
    Xu, Zhenghua
    MEDICAL IMAGE ANALYSIS, 2023, 83
  • [34] Semi-Supervised Multi-Modal Multi-Instance Multi-Label Deep Network with Optimal Transport
    Yang, Yang
    Fu, Zhao-Yang
    Zhan, De-Chuan
    Liu, Zhi-Bin
    Jiang, Yuan
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2021, 33 (02) : 696 - 709
  • [35] Cancer immunotherapy response prediction from multi-modal clinical and image data using semi-supervised deep learning
    Wang, Xi
    Jiang, Yuming
    Chen, Hao
    Zhang, Taojun
    Han, Zhen
    Chen, Chuanli
    Yuan, Qingyu
    Xiong, Wenjun
    Wang, Wei
    Li, Guoxin
    Heng, Pheng-Ann
    Li, Ruijiang
    RADIOTHERAPY AND ONCOLOGY, 2023, 186
  • [36] Multi-Level Cross-Modal Interactive-Network-Based Semi-Supervised Multi-Modal Ship Classification
    Song, Xin
    Chen, Zhikui
    Zhong, Fangming
    Gao, Jing
    Zhang, Jianning
    Li, Peng
    SENSORS, 2024, 24 (22)
  • [37] DualSign: Semi-Supervised Sign Language Production with Balanced Multi-Modal Multi-Task Dual Transformation
    Huang, Wencan
    Zhao, Zhou
    He, Jinzheng
    Zhang, Mingmin
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022 : 5486 - 5495
  • [38] Semi-supervised Convolutional Neural Networks for Flood Mapping using Multi-modal Remote Sensing Data
    Viet-Hung Luu
    Minh-Son Dao
    Thi Nhat-Thanh Nguyen
    Perry, Stuart
    Zettsu, Koji
    PROCEEDINGS OF 2019 6TH NATIONAL FOUNDATION FOR SCIENCE AND TECHNOLOGY DEVELOPMENT (NAFOSTED) CONFERENCE ON INFORMATION AND COMPUTER SCIENCE (NICS), 2019 : 342 - 347
  • [39] Semi-supervised Label Generation for 3D Multi-modal MRI Bone Tumor Segmentation
    Curto-Vilalta, Anna
    Schlossmacher, Benjamin
    Valle, Christina
    Gersing, Alexandra
    Neumann, Jan
    von Eisenhart-Rothe, Ruediger
    Rueckert, Daniel
    Hinterwimmer, Florian
    JOURNAL OF IMAGING INFORMATICS IN MEDICINE, 2025
  • [40] FEATURE INTEGRATION VIA SEMI-SUPERVISED ORDINALLY MULTI-MODAL GAUSSIAN PROCESS LATENT VARIABLE MODEL
    Kamikawa, Kyohei
    Maeda, Keisuke
    Ogawa, Takahiro
    Haseyama, Miki
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021 : 4130 - 4134