Deep contrastive representation learning for multi-modal clustering

Cited by: 2
Authors
Lu, Yang [1 ,2 ]
Li, Qin [3 ]
Zhang, Xiangdong [1 ]
Gao, Quanxue [1 ]
Affiliations
[1] Xidian Univ, Sch Telecommun Engn, Xian 710071, Shaanxi, Peoples R China
[2] Res Inst Air Force, Beijing, Peoples R China
[3] Shenzhen Inst Informat Technol, Sch Software Engn, Shenzhen 518172, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Multi-view representation learning; Self-supervision; Clustering;
DOI
10.1016/j.neucom.2024.127523
CLC number
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Benefiting from the informative expression capability of contrastive representation learning (CRL), recent multi-modal learning studies have achieved promising clustering performance. However, existing CRL-based multi-modal clustering methods fail to simultaneously exploit the similarity information embedded at both the inter- and intra-modal levels. In this study, we explore deep multi-modal contrastive representation learning and present a multi-modal learning network, named trustworthy multi-modal contrastive clustering (TMCC), which combines contrastive learning and adaptively reliable sample selection with multi-modal clustering. Specifically, we design an adaptive filter that trains TMCC by progressing from 'easy' to 'complex' samples. Based on this, using the highly confident clustering labels, we present a new contrastive loss for learning a modal-consensus representation, which takes into account not only the inter-modal similarity but also the intra-modal similarity. Experimental results show that these principles consistently improve the clustering performance of TMCC.
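The loss described in the abstract combines two kinds of positives: the same sample seen in different modalities (inter-modal), and samples within one modality that share a confident cluster pseudo-label (intra-modal). The following is a minimal NumPy sketch of such a combined objective, not the authors' TMCC implementation; the function name `multimodal_contrastive_loss`, the temperature `tau`, and the use of pseudo-labels to define intra-modal positives are illustrative assumptions.

```python
import numpy as np

def l2_normalize(x, axis=1, eps=1e-8):
    """Row-normalize embeddings so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def multimodal_contrastive_loss(z1, z2, labels, tau=0.5):
    """Toy two-modality contrastive loss.

    z1, z2 : (n, d) embeddings of the same n samples in two modalities.
    labels : (n,) confident cluster pseudo-labels.
    Returns an inter-modal InfoNCE term plus an intra-modal term that
    treats same-pseudo-label pairs within each modality as positives.
    """
    z1, z2 = l2_normalize(z1), l2_normalize(z2)
    n = z1.shape[0]

    # Inter-modal term: cross-modality similarity matrix, positives on
    # the diagonal (sample i in modality 1 vs sample i in modality 2).
    sim = z1 @ z2.T / tau
    logp = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    inter = -np.mean(np.diag(logp))

    # Intra-modal term: within each modality, pull together samples that
    # share a pseudo-label and push apart the rest.
    intra = 0.0
    for z in (z1, z2):
        s = z @ z.T / tau
        np.fill_diagonal(s, -np.inf)  # exclude trivial self-pairs
        logp = s - np.log(np.exp(s).sum(axis=1, keepdims=True))
        pos = (labels[:, None] == labels[None, :]) & ~np.eye(n, dtype=bool)
        intra += -np.mean(logp[pos])
    return inter + intra / 2
```

Because both terms are negative log-softmax values, the loss is strictly positive; training would minimize it over the encoder parameters that produce `z1` and `z2`.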
Pages: 8
Related papers
50 records total
  • [41] Editorial for Special Issue on Multi-modal Representation Learning
    Fan, Deng-Ping
    Barnes, Nick
    Cheng, Ming-Ming
    Van Gool, Luc
    MACHINE INTELLIGENCE RESEARCH, 2024, 21 (04) : 615 - 616
  • [42] Multi-modal Graph Contrastive Learning for Micro-video Recommendation
    Yi, Zixuan
    Wang, Xi
    Ounis, Iadh
    Macdonald, Craig
    PROCEEDINGS OF THE 45TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '22), 2022, : 1807 - 1811
  • [43] Reliable multi-modal prototypical contrastive learning for difficult airway assessment
    Li, Xiaofan
    Peng, Bo
    Yao, Yuan
    Zhang, Guangchao
    Xie, Zhuyang
    Saleem, Muhammad Usman
    EXPERT SYSTEMS WITH APPLICATIONS, 2025, 273
  • [44] A Hard Negatives Mining and Enhancing Method for Multi-Modal Contrastive Learning
    Li, Guangping
    Gao, Yanan
    Huang, Xianhui
    Ling, Bingo Wing-Kuen
    ELECTRONICS, 2025, 14 (04)
  • [45] ConOffense: Multi-modal multitask Contrastive learning for offensive content identification
    Shome, Debaditya
    Kar, T.
    2021 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2021, : 4524 - 4529
  • [46] A deep contrastive multi-modal encoder for multi-omics data integration and analysis
    Yinghua, Ma
    Khan, Ahmad
    Heng, Yang
    Khan, Fiaz Gul
    Ali, Farman
    Al-Otaibi, Yasser D.
    Bashir, Ali Kashif
    INFORMATION SCIENCES, 2025, 700
  • [47] MultiCAD: Contrastive Representation Learning for Multi-modal 3D Computer-Aided Design Models
    Ma, Weijian
    Xu, Minyang
    Li, Xueyang
    Zhou, Xiangdong
    PROCEEDINGS OF THE 32ND ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2023, 2023, : 1766 - 1776
  • [48] Hierarchical sparse representation with deep dictionary for multi-modal classification
    Wang, Zhengxia
    Teng, Shenghua
    Liu, Guodong
    Zhao, Zengshun
    Wu, Hongli
    NEUROCOMPUTING, 2017, 253 : 65 - 69
  • [49] Multi-scale and multi-modal contrastive learning network for biomedical time series
    Guo, Hongbo
    Xu, Xinzi
    Wu, Hao
    Liu, Bin
    Xia, Jiahui
    Cheng, Yibang
    Guo, Qianhui
    Chen, Yi
    Xu, Tingyan
    Wang, Jiguang
    Wang, Guoxing
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2025, 106
  • [50] Multi-Modal Song Mood Detection with Deep Learning
    Pyrovolakis, Konstantinos
    Tzouveli, Paraskevi
    Stamou, Giorgos
    SENSORS, 2022, 22 (03)