Deep contrastive representation learning for multi-modal clustering

Authors
Lu, Yang [1, 2]
Li, Qin [3]
Zhang, Xiangdong [1]
Gao, Quanxue [1]
Affiliations
[1] Xidian Univ, Sch Telecommun Engn, Xian 710071, Shaanxi, Peoples R China
[2] Res Inst Air Force, Beijing, Peoples R China
[3] Shenzhen Inst Informat Technol, Sch Software Engn, Shenzhen 518172, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Multi-view representation learning; Self-supervision; Clustering
DOI
10.1016/j.neucom.2024.127523
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Benefiting from the expressive power of contrastive representation learning (CRL), recent multi-modal learning studies have achieved promising clustering performance. However, existing CRL-based multi-modal clustering methods fail to simultaneously exploit the similarity information embedded at the inter-modal and intra-modal levels. In this study, we explore deep multi-modal contrastive representation learning and present a multi-modal learning network, named trustworthy multimodal contrastive clustering (TMCC), which combines contrastive learning and adaptive selection of reliable samples with multi-modal clustering. Specifically, we use an adaptive filter to train TMCC by progressing from 'easy' to 'complex' samples. Then, using the resulting high-confidence clustering labels, we present a new contrastive loss for learning a modal-consensus representation that takes into account both inter-modal and intra-modal similarity. Experimental results show that these principles consistently improve the clustering performance of TMCC.
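The abstract gives no formulas, so the following is only a rough sketch of the two ideas it names: filtering for high-confidence samples and a contrastive loss that counts both inter-modal and intra-modal pairs as positives. Everything here is an assumption rather than the TMCC formulation: the function names `select_confident` and `label_guided_contrastive_loss` are hypothetical, the "adaptive filter" is replaced by a simple confidence threshold that is lowered over epochs, and the loss is a generic supervised-contrastive-style objective driven by pseudo-labels.

```python
import numpy as np


def select_confident(probs, threshold):
    """Hypothetical filter: keep samples whose top soft-assignment
    probability reaches `threshold`; return pseudo-labels and the mask."""
    labels = probs.argmax(axis=1)
    mask = probs.max(axis=1) >= threshold
    return labels, mask


def label_guided_contrastive_loss(z_a, z_b, labels, temperature=0.5):
    """Supervised-contrastive-style loss over two modalities (assumed form).

    Embeddings of both modalities are pooled, so the positives of an anchor
    are all other embeddings with the same pseudo-label, whether they come
    from the same modality (intra-modal) or the other one (inter-modal).
    """
    if z_a.shape[0] == 0:
        raise ValueError("need at least one selected sample")
    z = np.concatenate([z_a, z_b], axis=0)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # cosine similarity
    y = np.concatenate([labels, labels])
    not_self = ~np.eye(z.shape[0], dtype=bool)
    # Scaled similarities; self-pairs are excluded from the softmax.
    sim = np.where(not_self, z @ z.T / temperature, -np.inf)
    row_max = sim.max(axis=1, keepdims=True)
    log_denom = row_max + np.log(np.exp(sim - row_max).sum(axis=1, keepdims=True))
    pos = (y[:, None] == y[None, :]) & not_self
    # Each anchor has at least one positive: its own other-modality copy.
    log_prob_pos = np.where(pos, sim - log_denom, 0.0)
    return float(-(log_prob_pos.sum(axis=1) / pos.sum(axis=1)).mean())


# Toy usage with random stand-in data (not from the paper).
rng = np.random.default_rng(0)
probs = rng.dirichlet(np.ones(3), size=16)  # stand-in soft cluster assignments
z_a = rng.normal(size=(16, 8))              # modality A embeddings
z_b = rng.normal(size=(16, 8))              # modality B embeddings
for epoch, threshold in enumerate([0.9, 0.7, 0.5]):  # 'easy' to 'complex'
    labels, mask = select_confident(probs, threshold)
    if not mask.any():
        continue
    loss = label_guided_contrastive_loss(z_a[mask], z_b[mask], labels[mask])
    print(f"epoch {epoch}: kept {int(mask.sum())} samples, loss {loss:.4f}")
```

Lowering the threshold admits progressively less certain samples, which is one simple way to read the 'easy' to 'complex' progression; the paper's filter is described as adaptive and may work differently.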
Pages: 8