Deep contrastive representation learning for multi-modal clustering

Cited by: 2
Authors
Lu, Yang [1 ,2 ]
Li, Qin [3 ]
Zhang, Xiangdong [1 ]
Gao, Quanxue [1 ]
Affiliations
[1] Xidian Univ, Sch Telecommun Engn, Xian 710071, Shaanxi, Peoples R China
[2] Res Inst Air Force, Beijing, Peoples R China
[3] Shenzhen Inst Informat Technol, Sch Software Engn, Shenzhen 518172, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Multi-view representation learning; Self-supervision; Clustering;
DOI
10.1016/j.neucom.2024.127523
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Benefiting from the informative expression capability of contrastive representation learning (CRL), recent multi-modal learning studies have achieved promising clustering performance. However, existing CRL-based multi-modal clustering methods fail to simultaneously exploit the similarity information embedded at both the inter-modal and intra-modal levels. In this study, we explore deep multi-modal contrastive representation learning and present a multi-modal learning network, named trustworthy multi-modal contrastive clustering (TMCC), which integrates contrastive learning and adaptively reliable sample selection into multi-modal clustering. Specifically, we design an adaptive filter that trains TMCC by progressing from 'easy' to 'complex' samples. Building on this, and using the highly confident clustering labels, we present a new contrastive loss for learning a modal-consensus representation that accounts for not only inter-modal similarity but also intra-modal similarity. Experimental results show that these principles consistently improve the clustering performance of TMCC.
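The two ideas the abstract describes, an 'easy'-to-'complex' confidence filter and a contrastive loss whose positives span both modalities and same-cluster samples within a modality, can be sketched as follows. This is a minimal illustration, not the authors' exact formulation: the function names, the linear threshold schedule, and the NumPy implementation are assumptions for exposition only.

```python
import numpy as np

def select_confident(probs, epoch, max_epochs, start=0.95, end=0.5):
    """Illustrative adaptive filter (hypothetical schedule, not the paper's).

    The confidence threshold decays linearly over training, so only 'easy'
    (high-confidence) samples are kept early on and 'complex' samples are
    admitted later.
    """
    thr = start - (start - end) * epoch / max_epochs
    return probs.max(axis=1) >= thr  # boolean mask over samples

def modal_consensus_contrastive_loss(z1, z2, labels, temperature=0.5):
    """Illustrative InfoNCE-style loss over two modality embeddings.

    Positives for each anchor are (a) the same sample seen in the other
    modality (inter-modal similarity) and (b) other samples sharing the same
    confident cluster label, within or across modalities (intra-modal
    similarity). This is a sketch of the general idea, not TMCC's exact loss.
    """
    # L2-normalize so dot products are cosine similarities
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    n = z1.shape[0]
    z = np.concatenate([z1, z2], axis=0)          # (2n, d)
    sim = z @ z.T / temperature                   # temperature-scaled similarities
    lab = np.concatenate([labels, labels])
    eye = np.eye(2 * n, dtype=bool)
    # positives: same confident cluster label, excluding self-pairs
    pos = (lab[:, None] == lab[None, :]) & ~eye
    # the two modality views of the same sample are always positives
    idx = np.arange(n)
    pos[idx, idx + n] = True
    pos[idx + n, idx] = True
    # log-softmax over all non-self pairs; average the negative log-probability
    # assigned to positive pairs
    logits = np.where(eye, -np.inf, sim)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.where(pos, log_prob, 0.0).sum() / pos.sum())
```

In a full pipeline one would apply `select_confident` to the soft cluster assignments first and compute the loss only on the retained samples, so unreliable pseudo-labels do not corrupt the positive set.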
Pages: 8
Related Papers
(50 in total)
  • [21] Multi-modal brain tumor segmentation via disentangled representation learning and region-aware contrastive learning. Zhou, Tongxue. PATTERN RECOGNITION, 2024, 149 (149).
  • [22] Improving Code Search with Multi-Modal Momentum Contrastive Learning. Shi, Zejian; Xiong, Yun; Zhang, Yao; Jiang, Zhijie; Zhao, Jinjing; Wang, Lei; Li, Shanshan. 2023 IEEE/ACM 31ST INTERNATIONAL CONFERENCE ON PROGRAM COMPREHENSION, ICPC, 2023: 280-291.
  • [23] Improving Medical Multi-modal Contrastive Learning with Expert Annotations. Kumar, Yogesh; Marttinen, Pekka. COMPUTER VISION - ECCV 2024, PT XX, 2025, 15078: 468-486.
  • [24] CrossCLR: Cross-modal Contrastive Learning For Multi-modal Video Representations. Zolfaghari, Mohammadreza; Zhu, Yi; Gehler, Peter; Brox, Thomas. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021: 1430-1439.
  • [25] Multi-Modal Deep Clustering: Unsupervised Partitioning of Images. Shiran, Guy; Weinshall, Daphna. 2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021: 4728-4735.
  • [26] Multi-modal hypergraph contrastive learning for medical image segmentation. Jing, Weipeng; Wang, Junze; Di, Donglin; Li, Dandan; Song, Yang; Fan, Lei. PATTERN RECOGNITION, 2025, 165.
  • [27] CrossMoCo: Multi-modal Momentum Contrastive Learning for Point Cloud. Paul, Sneha; Patterson, Zachary; Bouguila, Nizar. 2023 20TH CONFERENCE ON ROBOTS AND VISION, CRV, 2023: 273-280.
  • [28] Collaborative denoised graph contrastive learning for multi-modal recommendation. Xu, Fuyong; Zhu, Zhenfang; Fu, Yixin; Wang, Ru; Liu, Peiyu. INFORMATION SCIENCES, 2024, 679.
  • [29] Multi-modal contrastive learning of subcellular organization using DICE. Nasser, Rami; Schaffer, Leah V.; Ideker, Trey; Sharan, Roded. BIOINFORMATICS, 2024, 40: ii105-ii110.
  • [30] Deep Contrastive Multi-View Subspace Clustering With Representation and Cluster Interactive Learning. Yu, Xuejiao; Jiang, Yi; Chao, Guoqing; Chu, Dianhui. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2025, 37 (01): 188-199.