Information-theoretic analysis of generalization capability of learning algorithms

Cited by: 0
Authors
Xu, Aolin [1,2]
Raginsky, Maxim [1,2]
Affiliations
[1] Univ Illinois, Dept Elect & Comp Engn, 1406 W Green St, Urbana, IL 61801 USA
[2] Univ Illinois, Coordinated Sci Lab, 1101 W Springfield Ave, Urbana, IL 61801 USA
Funding
U.S. National Science Foundation
Keywords
STABILITY
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
We derive upper bounds on the generalization error of a learning algorithm in terms of the mutual information between its input and output. The bounds provide an information-theoretic understanding of generalization in learning problems, and give theoretical guidelines for striking the right balance between data fit and generalization by controlling the input-output mutual information. We propose a number of methods for this purpose, among which are algorithms that regularize the ERM algorithm with relative entropy or with random noise. Our work extends and leads to nontrivial improvements on the recent results of Russo and Zou.
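A minimal sketch, in LaTeX, of the kind of bound the abstract describes, stated under a σ-subgaussian loss assumption (the form below matches the paper's main theorem as published). Here S = (Z_1, ..., Z_n) is the i.i.d. training sample drawn from μ, W is the output of the learning algorithm P_{W|S}, and ℓ is the loss function:

\[
  L_S(w) = \frac{1}{n} \sum_{i=1}^{n} \ell(w, Z_i), \qquad
  L_\mu(w) = \mathbb{E}_{Z \sim \mu}[\ell(w, Z)], \qquad
  \mathrm{gen}(\mu, P_{W \mid S}) = \mathbb{E}\bigl[ L_\mu(W) - L_S(W) \bigr].
\]
If \(\ell(w, Z)\) is \(\sigma\)-subgaussian under \(\mu\) for every \(w\), then
\[
  \bigl| \mathrm{gen}(\mu, P_{W \mid S}) \bigr| \le \sqrt{ \frac{2 \sigma^2}{n} \, I(S; W) },
\]
where \(I(S; W)\) is the mutual information between the algorithm's input (the sample S) and its output (the hypothesis W). The relative-entropy regularization of ERM mentioned in the abstract corresponds to the Gibbs algorithm, \(P_{W \mid S = s}(\mathrm{d}w) \propto e^{-\beta L_s(w)} \, \pi(\mathrm{d}w)\), whose inverse temperature \(\beta\) trades empirical fit against the size of \(I(S; W)\).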
Pages: 10
Related Papers (50 items total)
  • [1] Wu, Xuetong; Manton, Jonathan H.; Aickelin, Uwe; Zhu, Jingge. On the Generalization for Transfer Learning: An Information-Theoretic Analysis. IEEE Transactions on Information Theory, 2024, 70(10): 7089-7124.
  • [2] Aminian, Gholamali; Toni, Laura; Rodrigues, Miguel R. D. Information-Theoretic Bounds on the Moments of the Generalization Error of Learning Algorithms. 2021 IEEE International Symposium on Information Theory (ISIT), 2021: 682-687.
  • [3] Harutyunyan, Hrayr; Raginsky, Maxim; Ver Steeg, Greg; Galstyan, Aram. Information-Theoretic Generalization Bounds for Black-Box Learning Algorithms. Advances in Neural Information Processing Systems 34 (NeurIPS 2021), 2021.
  • [4] Jose, Sharu Theresa; Simeone, Osvaldo. Transfer Learning for Quantum Classifiers: An Information-Theoretic Generalization Analysis. 2023 IEEE Information Theory Workshop (ITW), 2023: 532-537.
  • [5] Raginsky, Maxim; Rakhlin, Alexander; Tsao, Matthew; Wu, Yihong; Xu, Aolin. Information-Theoretic Analysis of Stability and Bias of Learning Algorithms. 2016 IEEE Information Theory Workshop (ITW), 2016.
  • [6] Chen, Qi; Shui, Changjian; Marchand, Mario. Generalization Bounds for Meta-Learning: An Information-Theoretic Analysis. Advances in Neural Information Processing Systems 34 (NeurIPS 2021), 2021.
  • [7] Bian, Yatao; Gronskiy, Alexey; Buhmann, Joachim M. Information-Theoretic Analysis of MAXCUT Algorithms. 2016 Information Theory and Applications Workshop (ITA), 2016.
  • [8] Wu, Xuetong; Manton, Jonathan H.; Aickelin, Uwe; Zhu, Jingge. Information-Theoretic Analysis for Transfer Learning. 2020 IEEE International Symposium on Information Theory (ISIT), 2020: 2819-2824.
  • [9] Jose, Sharu Theresa; Simeone, Osvaldo. Information-Theoretic Generalization Bounds for Meta-Learning and Applications. Entropy, 2021, 23(1): 1-28.
  • [10] Ben-David, A. Monotonicity Maintenance in Information-Theoretic Machine Learning Algorithms. Machine Learning, 1995, 19(1): 29-43.