Deep contrastive representation learning for multi-modal clustering

Cited by: 2
Authors
Lu, Yang [1 ,2 ]
Li, Qin [3 ]
Zhang, Xiangdong [1 ]
Gao, Quanxue [1 ]
Affiliations
[1] Xidian Univ, Sch Telecommun Engn, Xian 710071, Shaanxi, Peoples R China
[2] Res Inst Air Force, Beijing, Peoples R China
[3] Shenzhen Inst Informat Technol, Sch Software Engn, Shenzhen 518172, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Multi-view representation learning; Self-supervision; Clustering;
DOI
10.1016/j.neucom.2024.127523
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Benefiting from the informative expression capability of contrastive representation learning (CRL), recent multi-modal learning studies have achieved promising clustering performance. However, existing CRL-based multi-modal clustering methods fail to simultaneously exploit the similarity information embedded at both the inter-modal and intra-modal levels. In this study, we explore deep multi-modal contrastive representation learning and present a multi-modal learning network, named trustworthy multi-modal contrastive clustering (TMCC), which integrates contrastive learning and adaptively reliable sample selection into multi-modal clustering. Specifically, we design an adaptive filter that trains TMCC by progressing from 'easy' to 'complex' samples. Building on this, and using the highly confident clustering labels, we present a new contrastive loss for learning a modal-consensus representation that accounts for not only inter-modal similarity but also intra-modal similarity. Experimental results show that these principles consistently improve the clustering performance of TMCC.
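The two ideas the abstract describes, an 'easy'-to-'complex' confidence filter and a contrastive loss whose positives span both modalities and same-cluster samples within a modality, can be sketched as follows. This is a minimal illustration, not the authors' exact formulation: the function names, the linear threshold schedule, and the NumPy implementation are assumptions for exposition only.

```python
import numpy as np

def select_confident(probs, epoch, max_epochs, start=0.95, end=0.5):
    """Illustrative adaptive filter (hypothetical schedule, not the paper's).

    The confidence threshold decays linearly over training, so only 'easy'
    (high-confidence) samples are kept early on and 'complex' samples are
    admitted later.
    """
    thr = start - (start - end) * epoch / max_epochs
    return probs.max(axis=1) >= thr  # boolean mask over samples

def modal_consensus_contrastive_loss(z1, z2, labels, temperature=0.5):
    """Illustrative InfoNCE-style loss over two modality embeddings.

    Positives for each anchor are (a) the same sample seen in the other
    modality (inter-modal similarity) and (b) other samples sharing the same
    confident cluster label, within or across modalities (intra-modal
    similarity). This is a sketch of the general idea, not TMCC's exact loss.
    """
    # L2-normalize so dot products are cosine similarities
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    n = z1.shape[0]
    z = np.concatenate([z1, z2], axis=0)          # (2n, d)
    sim = z @ z.T / temperature                   # temperature-scaled similarities
    lab = np.concatenate([labels, labels])
    eye = np.eye(2 * n, dtype=bool)
    # positives: same confident cluster label, excluding self-pairs
    pos = (lab[:, None] == lab[None, :]) & ~eye
    # the two modality views of the same sample are always positives
    idx = np.arange(n)
    pos[idx, idx + n] = True
    pos[idx + n, idx] = True
    # log-softmax over all non-self pairs; average the negative log-probability
    # assigned to positive pairs
    logits = np.where(eye, -np.inf, sim)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.where(pos, log_prob, 0.0).sum() / pos.sum())
```

In a full pipeline one would apply `select_confident` to the soft cluster assignments first and compute the loss only on the retained samples, so unreliable pseudo-labels do not corrupt the positive set.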
Pages: 8
Related Papers
(50 in total)
  • [21] Multi-modal brain tumor segmentation via disentangled representation learning and region-aware contrastive learning. Zhou, Tongxue. PATTERN RECOGNITION, 2024, 149 (149).
  • [22] Improving Code Search with Multi-Modal Momentum Contrastive Learning. Shi, Zejian; Xiong, Yun; Zhang, Yao; Jiang, Zhijie; Zhao, Jinjing; Wang, Lei; Li, Shanshan. 2023 IEEE/ACM 31ST INTERNATIONAL CONFERENCE ON PROGRAM COMPREHENSION, ICPC, 2023: 280-291.
  • [23] Improving Medical Multi-modal Contrastive Learning with Expert Annotations. Kumar, Yogesh; Marttinen, Pekka. COMPUTER VISION - ECCV 2024, PT XX, 2025, 15078: 468-486.
  • [24] CrossCLR: Cross-modal Contrastive Learning For Multi-modal Video Representations. Zolfaghari, Mohammadreza; Zhu, Yi; Gehler, Peter; Brox, Thomas. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021: 1430-1439.
  • [25] Multi-Modal Deep Clustering: Unsupervised Partitioning of Images. Shiran, Guy; Weinshall, Daphna. 2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021: 4728-4735.
  • [26] Multi-modal hypergraph contrastive learning for medical image segmentation. Jing, Weipeng; Wang, Junze; Di, Donglin; Li, Dandan; Song, Yang; Fan, Lei. PATTERN RECOGNITION, 2025, 165.
  • [27] CrossMoCo: Multi-modal Momentum Contrastive Learning for Point Cloud. Paul, Sneha; Patterson, Zachary; Bouguila, Nizar. 2023 20TH CONFERENCE ON ROBOTS AND VISION, CRV, 2023: 273-280.
  • [28] Collaborative denoised graph contrastive learning for multi-modal recommendation. Xu, Fuyong; Zhu, Zhenfang; Fu, Yixin; Wang, Ru; Liu, Peiyu. INFORMATION SCIENCES, 2024, 679.
  • [29] Multi-modal contrastive learning of subcellular organization using DICE. Nasser, Rami; Schaffer, Leah V.; Ideker, Trey; Sharan, Roded. BIOINFORMATICS, 2024, 40: ii105-ii110.
  • [30] Deep Contrastive Multi-View Subspace Clustering With Representation and Cluster Interactive Learning. Yu, Xuejiao; Jiang, Yi; Chao, Guoqing; Chu, Dianhui. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2025, 37 (01): 188-199.