Towards a rigorous analysis of mutual information in contrastive learning

Cited by: 0
Authors
Lee, Kyungeun [1 ,4 ]
Kim, Jaeill [1 ]
Kang, Suhyun [1 ]
Rhee, Wonjong [1 ,2 ,3 ]
Affiliations
[1] Seoul Natl Univ, Dept Intelligence & Informat, 1 Gwanak Ro, Seoul 08826, South Korea
[2] Seoul Natl Univ, Interdisciplinary Program Artificial Intelligence, 1 Gwanak Ro, Seoul 08826, South Korea
[3] Seoul Natl Univ, AI Inst, 1 Gwanak Ro, Seoul 08826, South Korea
[4] LG AI Res, 150 Magokjungang Ro, Seoul 07789, South Korea
Funding
National Research Foundation of Singapore;
Keywords
Representation learning; Contrastive learning; Mutual information; Unsupervised learning;
DOI
10.1016/j.neunet.2024.106584
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Contrastive learning has emerged as a cornerstone of unsupervised representation learning. Its primary paradigm involves an instance discrimination task using the InfoNCE loss, which has been proven to be a form of mutual information. Consequently, it has become common practice to analyze contrastive learning with mutual information as a measure. Yet this approach presents difficulties because mutual information must be estimated in real-world applications. This creates a gap between the elegance of the mathematical foundation and the complexity of the estimation, hampering the ability to derive solid and meaningful insights from mutual information analysis. In this study, we introduce three novel methods and a few related theorems aimed at enhancing the rigor of mutual information analysis. Despite their simplicity, these methods can carry substantial utility. Leveraging these approaches, we reassess three instances of contrastive learning analysis, illustrating the capacity of the proposed methods to facilitate deeper comprehension or to rectify pre-existing misconceptions. The main results can be summarized as follows: (1) While small batch sizes influence the range of the training loss, they do not inherently limit the learned representation's information content or adversely affect downstream performance; (2) Mutual information, with careful selection of positive pairings and post-training estimation, proves to be a superior measure for evaluating practical networks; and (3) Distinguishing between task-relevant and irrelevant information presents challenges, yet irrelevant information sources do not necessarily compromise generalization on downstream tasks.
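The abstract's first finding concerns the relation between batch size and the InfoNCE loss: the standard result (Oord et al., 2018) is that the loss yields the lower bound I(X; Y) >= log(N) - L_InfoNCE, so a small batch size N caps the *bound*, not necessarily the information actually learned. A minimal NumPy sketch of the in-batch InfoNCE loss and this bound, with illustrative function names (not the authors' implementation):

```python
import numpy as np

def info_nce_loss(z1, z2, temperature=0.1):
    """InfoNCE loss for a batch of N positive pairs (z1[i], z2[i]).

    Every other in-batch pairing serves as a negative. Rows are
    L2-normalized so logits are scaled cosine similarities.
    """
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature  # (N, N) similarity matrix
    # Cross-entropy with the diagonal (the true positives) as targets.
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

def mi_lower_bound(loss, batch_size):
    """I(X; Y) >= log(N) - L_InfoNCE; the bound saturates at log(N)."""
    return np.log(batch_size) - loss
```

Since the cross-entropy term is nonnegative, the bound can never exceed log(N) regardless of how informative the representation is, which is the sense in which batch size limits the *measured* rather than the *learned* information.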
Pages: 17
Related papers
50 records total
  • [41] Contrastive Mutual Learning With Pseudo-Label Smoothing for Hyperspectral Image Classification
    Liu, Lizhu
    Zhang, Hui
    Wang, Yaonan
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2024, 73 : 1 - 14
  • [42] Contrastive Learning for API Aspect Analysis
    Shahariar, G. M.
    Hasan, Tahmid
    Iqbal, Anindya
    Uddin, Gias
    2023 38TH IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING, ASE, 2023, : 637 - 648
  • [43] An EEG Transfer Learning Algorithm Based on Mutual Information and Transfer Component Analysis
    Hu, Cungang
    Cai, Jicheng
    Liang, Zilin
    Wang, Kai
    Zhang, Yue
    Chen, Weihai
    2022 IEEE 17TH CONFERENCE ON INDUSTRIAL ELECTRONICS AND APPLICATIONS (ICIEA), 2022, : 949 - 954
  • [44] Towards an Efficient Algorithm for Computing the Reduced Mutual Information
    Renedo-Mirambell, Martí
    Arratia, Argimiro
    ARTIFICIAL INTELLIGENCE RESEARCH AND DEVELOPMENT, 2022, 356 : 168 - 171
  • [45] Towards Adversarial Robustness with Multidimensional Perturbations via Contrastive Learning
    Chen, Chuanxi
    Ye, Dengpan
    Wang, Hao
    Tang, Long
    Xu, Yue
    2022 IEEE INTERNATIONAL CONFERENCE ON TRUST, SECURITY AND PRIVACY IN COMPUTING AND COMMUNICATIONS, TRUSTCOM, 2022, : 184 - 191
  • [46] Towards Powerful Graph Contrastive Learning without Negative Examples
    Cen, Keting
    Shen, Huawei
    Cao, Qi
    Xu, Bingbing
    Cheng, Xueqi
    2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022,
  • [47] Towards Robust Rumor Detection with Graph Contrastive and Curriculum Learning
    Zhuang, Wen-Ming
    Chen, Chih-Yao
    Li, Cheng-Te
    ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA, 2024, 18 (07)
  • [48] Dissecting Deep Learning Networks: Visualizing Mutual Information
    Fang, Hui
    Wang, Victoria
    Yamaguchi, Motonori
    ENTROPY, 2018, 20 (11)
  • [49] Mutual Information Regularized Offline Reinforcement Learning
    Ma, Xiao
    Kang, Bingyi
    Xu, Zhongwen
    Lin, Min
    Yan, Shuicheng
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [50] Learning from examples with quadratic mutual information
    Xu, DX
    Principe, JC
    NEURAL NETWORKS FOR SIGNAL PROCESSING VIII, 1998, : 155 - 164