Towards a rigorous analysis of mutual information in contrastive learning

Authors
Lee, Kyungeun [1, 4]
Kim, Jaeill [1]
Kang, Suhyun [1]
Rhee, Wonjong [1, 2, 3]
Affiliations
[1] Seoul Natl Univ, Dept Intelligence & Informat, 1 Gwanak Ro, Seoul 08826, South Korea
[2] Seoul Natl Univ, Interdisciplinary Program Artificial Intelligence, 1 Gwanak Ro, Seoul 08826, South Korea
[3] Seoul Natl Univ, AI Inst, 1 Gwanak Ro, Seoul 08826, South Korea
[4] LG AI Res, 150 Magokjungang Ro, Seoul 07789, South Korea
Funding
National Research Foundation, Singapore;
Keywords
Representation learning; Contrastive learning; Mutual information; Unsupervised learning;
DOI
10.1016/j.neunet.2024.106584
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Contrastive learning has emerged as a cornerstone in unsupervised representation learning. Its primary paradigm involves an instance discrimination task utilizing the InfoNCE loss, which has been proven to be a form of mutual information. Consequently, it has become common practice to analyze contrastive learning using mutual information as a measure. Yet this analysis approach presents difficulties due to the necessity of estimating mutual information for real-world applications. This creates a gap between the elegance of its mathematical foundation and the complexity of its estimation, thereby hampering the ability to derive solid and meaningful insights from mutual information analysis. In this study, we introduce three novel methods and a few related theorems, aimed at enhancing the rigor of mutual information analysis. Despite their simplicity, these methods can carry substantial utility. Leveraging these approaches, we reassess three instances of contrastive learning analysis, illustrating the capacity of the proposed methods to facilitate deeper comprehension or to rectify pre-existing misconceptions. The main results can be summarized as follows: (1) While small batch sizes influence the range of training loss, they do not inherently limit the information content of the learned representation or affect downstream performance adversely; (2) Mutual information, with careful selection of positive pairings and post-training estimation, proves to be a superior measure for evaluating practical networks; and (3) Distinguishing between task-relevant and irrelevant information presents challenges, yet irrelevant information sources do not necessarily compromise the generalization of downstream tasks.
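As a rough illustration of why batch size affects the range of the training loss, the sketch below computes the InfoNCE loss on synthetic embedding pairs and the commonly used estimate log(N) - loss, which cannot exceed log(N) for a batch of N pairs. This is not the paper's proposed methodology: the function names (`info_nce_loss`, `mi_lower_bound`, `demo`), the temperature, the noise level, and the synthetic Gaussian data are all assumptions made for illustration.

```python
import math
import random


def _normalize(v):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def info_nce_loss(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a batch of (anchor, positive) embedding pairs.

    The positive for anchor i is positives[i]; every other row in the
    batch serves as a negative. Temperature is a hypothetical default.
    """
    a = [_normalize(v) for v in anchors]
    p = [_normalize(v) for v in positives]
    total = 0.0
    for i, a_i in enumerate(a):
        # Cosine similarity of anchor i with every candidate, scaled by temperature.
        logits = [sum(x * y for x, y in zip(a_i, p_j)) / temperature for p_j in p]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(a)


def mi_lower_bound(anchors, positives, temperature=0.1):
    """InfoNCE-style estimate in nats: log(N) - loss, never above log(N)."""
    return math.log(len(anchors)) - info_nce_loss(anchors, positives, temperature)


def demo(batch_size, dim=16, noise=0.05, seed=0):
    """Build synthetic pairs (positive = anchor + small noise) and return (estimate, cap)."""
    rng = random.Random(seed)
    anchors = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(batch_size)]
    positives = [[x + noise * rng.gauss(0, 1) for x in v] for v in anchors]
    return mi_lower_bound(anchors, positives), math.log(batch_size)


for n in (4, 32, 128):
    est, cap = demo(n)
    print(f"batch={n:4d}  estimate={est:.3f} nats  cap=log(batch)={cap:.3f}")
```

Because the loss is non-negative, the estimate is capped at log(N) regardless of how informative the representation is. This is one way to see that a small batch constrains the value the loss can report, which is a separate question from whether it limits what the representation has actually learned.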
Pages: 17