Towards a rigorous analysis of mutual information in contrastive learning

Cited by: 0
Authors
Lee, Kyungeun [1,4]
Kim, Jaeill [1]
Kang, Suhyun [1]
Rhee, Wonjong [1,2,3]
Affiliations
[1] Seoul Natl Univ, Dept Intelligence & Informat, 1 Gwanak Ro, Seoul 08826, South Korea
[2] Seoul Natl Univ, Interdisciplinary Program Artificial Intelligence, 1 Gwanak Ro, Seoul 08826, South Korea
[3] Seoul Natl Univ, AI Inst, 1 Gwanak Ro, Seoul 08826, South Korea
[4] LG AI Res, 150 Magokjungang Ro, Seoul 07789, South Korea
Funding
National Research Foundation of Singapore
Keywords
Representation learning; Contrastive learning; Mutual information; Unsupervised learning
DOI
10.1016/j.neunet.2024.106584
CLC number
TP18 [Artificial intelligence theory]
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Contrastive learning has emerged as a cornerstone of unsupervised representation learning. Its primary paradigm is an instance discrimination task trained with the InfoNCE loss, which has been proven to be a form of mutual information. Consequently, it has become common practice to analyze contrastive learning using mutual information as a measure. Yet this approach is difficult in practice, because mutual information must be estimated for real-world applications. This creates a gap between the elegance of the mathematical foundation and the complexity of the estimation, hampering the ability to derive solid and meaningful insights from mutual information analysis. In this study, we introduce three novel methods and several related theorems aimed at enhancing the rigor of mutual information analysis. Despite their simplicity, these methods carry substantial utility. Leveraging them, we reassess three instances of contrastive learning analysis, illustrating how the proposed methods can deepen understanding or rectify pre-existing misconceptions. The main results can be summarized as follows: (1) while small batch sizes bound the range of the training loss, they do not inherently limit the information content of the learned representations or adversely affect downstream performance; (2) mutual information, with careful selection of positive pairings and post-training estimation, proves to be a superior measure for evaluating practical networks; and (3) distinguishing task-relevant from task-irrelevant information is challenging, yet task-irrelevant information sources do not necessarily compromise generalization on downstream tasks.
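The batch-size result in (1) rests on the classical InfoNCE bound I(z1; z2) >= log N - L_InfoNCE, where N is the batch size (Oord et al., 2018). The following is a minimal PyTorch sketch of that bound, not the authors' implementation: the function name info_nce_loss, the temperature value, and the synthetic correlated views are all illustrative assumptions.

```python
import math

import torch
import torch.nn.functional as F


def info_nce_loss(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    """InfoNCE loss for a batch of positive pairs (z1[i], z2[i]).

    Each z1[i] is scored against every z2[j]; the diagonal entries act as
    positives and the off-diagonal entries as negatives.
    """
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature   # (N, N) cosine-similarity logits
    labels = torch.arange(z1.size(0))    # positive for row i sits at column i
    return F.cross_entropy(logits, labels)


# Hypothetical data standing in for two augmented "views" of the same instances.
N, d = 256, 128
z1 = torch.randn(N, d)
z2 = z1 + 0.1 * torch.randn(N, d)

loss = info_nce_loss(z1, z2)
# I(z1; z2) >= log N - L_InfoNCE: the estimate is capped at log N by
# construction, so a small batch caps the measurable value, not necessarily
# the information actually carried by the representations.
print(f"loss = {loss:.3f}, MI lower bound = {math.log(N) - loss:.3f} <= log N = {math.log(N):.3f}")
```

Because the estimate is capped at log N, a smaller batch only lowers the ceiling of the measured bound; the paper's first result is that this ceiling on the estimator does not by itself limit what the representations encode.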
Pages: 17