Early-Learning regularized Contrastive Learning for Cross-Modal Retrieval with Noisy Labels

Cited by: 7
Authors
Xu, Tianyuan [1]
Liu, Xueliang [1]
Huang, Zhen [2]
Guo, Dan [1]
Hong, Richang [1]
Wang, Meng [1]
Affiliations
[1] Hefei Univ Technol, Key Lab Knowledge Engn Big Data, Hefei, Peoples R China
[2] Natl Univ Def Technol, Changsha, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China;
Keywords
Cross-Modal Retrieval; Learning from Noise; Contrastive Learning; Early-Learning Regularization;
DOI
10.1145/3503161.3548066
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Subject classification codes
081203; 0835;
Abstract
Cross-modal retrieval has received intensive attention because it supports flexible queries across different modalities. In practice, however, retrieving cross-modal content is challenging when the training labels are noisy. Recent machine learning research shows that a model tends to fit cleanly labeled data in the early learning stage and only later memorizes the data with noisy labels. Although the clustering strategy used in cross-modal retrieval can alleviate the effect of outliers, the network rapidly overfits once the clean data is fitted well, and the noisy labels then force the cluster centers to drift. Motivated by these phenomena, we propose an Early-Learning regularized Contrastive Learning method for Cross-Modal Retrieval with Noisy Labels (ELRCMR). The method projects multi-modal data into a shared feature space by contrastive learning, in which early-learning regularization is employed to prevent the memorization of noisy labels during training, and a dynamic weight balance strategy is employed to alleviate clustering drift. Extensive experiments show that the proposed method addresses the cluster drift seen in conventional solutions and achieves promising performance on widely used benchmark datasets.
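The abstract does not give the regularizer's formula. As a rough illustration only, the sketch below follows the general early-learning regularization idea from the noisy-label literature: keep a running average of each sample's past predictions and reward agreement with it, so that late-stage memorization of noisy labels is discouraged. The function names, the momentum `beta`, the weight `lam`, and the `log(1 - <p, t>)` form are assumptions made for this sketch, not details taken from this record, and the contrastive and dynamic weight balance parts of ELRCMR are not shown.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def update_target(target, probs, beta=0.7):
    """Running (exponential moving) average of one sample's past predictions.
    beta is a hypothetical momentum value chosen for illustration."""
    return [beta * t + (1.0 - beta) * p for t, p in zip(target, probs)]

def elr_penalty(probs, target, eps=1e-4):
    """log(1 - <p, t>): becomes more negative as the current prediction
    agrees with the running target, so minimizing it rewards agreement."""
    inner = sum(p * t for p, t in zip(probs, target))
    inner = min(inner, 1.0 - eps)  # keep the log argument positive
    return math.log(1.0 - inner)

def elr_loss(logits, label, target, lam=3.0, beta=0.7):
    """Cross-entropy on the (possibly noisy) label plus the weighted penalty.
    Returns the loss value and the updated running target."""
    probs = softmax(logits)
    new_target = update_target(target, probs, beta)
    ce = -math.log(max(probs[label], 1e-12))
    return ce + lam * elr_penalty(probs, new_target), new_target

# Usage: one sample whose running target favors class 0,
# while its (possibly noisy) label says class 1.
target = [0.9, 0.05, 0.05]
loss, target = elr_loss([2.0, 0.5, -1.0], label=1, target=target)
```

In an actual training loop the running target would be treated as a constant when computing gradients; this pure-Python version only evaluates the loss value.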
Pages: 9
Related papers
50 items in total
  • [41] Cross-modal Contrastive Learning for Multimodal Fake News Detection
    Wang, Longzheng
    Zhang, Chuang
    Xu, Hongbo
    Xu, Yongxiu
    Xu, Xiaohan
    Wang, Siqi
    [J]. PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 5696 - 5704
  • [42] Improving Spoken Language Understanding with Cross-Modal Contrastive Learning
    Dong, Jingjing
    Fu, Jiayi
    Zhou, Peng
    Li, Hao
    Wang, Xiaorui
    [J]. INTERSPEECH 2022, 2022, : 2693 - 2697
  • [43] Enriched Music Representations With Multiple Cross-Modal Contrastive Learning
    Ferraro, Andres
    Favory, Xavier
    Drossos, Konstantinos
    Kim, Yuntae
    Bogdanov, Dmitry
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2021, 28 : 733 - 737
  • [44] Learning Cross-Modal Contrastive Features for Video Domain Adaptation
    Kim, Donghyun
    Tsai, Yi-Hsuan
    Zhuang, Bingbing
    Yu, Xiang
    Sclaroff, Stan
    Saenko, Kate
    Chandraker, Manmohan
    [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 13598 - 13607
  • [45] Cross-Modal Contrastive Learning for Remote Sensing Image Classification
    Feng, Zhixi
    Song, Liangliang
    Yang, Shuyuan
    Zhang, Xinyu
    Jiao, Licheng
    [J]. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61
  • [46] Category Alignment Adversarial Learning for Cross-Modal Retrieval
    He, Shiyuan
    Wang, Weiyang
    Wang, Zheng
    Xu, Xing
    Yang, Yang
    Wang, Xiaoming
    Shen, Heng Tao
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2023, 35 (05) : 4527 - 4538
  • [47] Adversarial cross-modal retrieval based on dictionary learning
    Shang, Fei
    Zhang, Huaxiang
    Zhu, Lei
    Sun, Jiande
    [J]. NEUROCOMPUTING, 2019, 355 : 93 - 104
  • [48] Heterogeneous Metric Learning for Cross-Modal Multimedia Retrieval
    Deng, Jun
    Du, Liang
    Shen, Yi-Dong
    [J]. WEB INFORMATION SYSTEMS ENGINEERING - WISE 2013, PT I, 2013, 8180 : 43 - 56
  • [49] Deep Multimodal Transfer Learning for Cross-Modal Retrieval
    Zhen, Liangli
    Hu, Peng
    Peng, Xi
    Goh, Rick Siow Mong
    Zhou, Joey Tianyi
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 33 (02) : 798 - 810
  • [50] Wasserstein Coupled Graph Learning for Cross-Modal Retrieval
    Wang, Yun
    Zhang, Tong
    Zhang, Xueya
    Cui, Zhen
    Huang, Yuge
    Shen, Pengcheng
    Li, Shaoxin
    Yang, Jian
    [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 1793 - 1802