Information-theoretic analysis for transfer learning

Cited by: 0
Authors:
Wu, Xuetong [1 ]
Manton, Jonathan H. [1 ]
Aickelin, Uwe [2 ]
Zhu, Jingge [1 ]
Affiliations:
[1] University of Melbourne, Department of Electrical and Electronic Engineering, Parkville, VIC, Australia
[2] University of Melbourne, Department of Computing and Information Systems, Parkville, VIC, Australia
DOI: 10.1109/isit44484.2020.9173989
Chinese Library Classification: TP301 [Theory and Methods]
Discipline code: 081202
Abstract:
Transfer learning, or domain adaptation, is concerned with machine learning problems in which training and test data come from possibly different distributions (denoted μ and μ′, respectively). In this work, we give an information-theoretic analysis of the generalization error and the excess risk of transfer learning algorithms, following a line of work initiated by Russo and Zou. Our results suggest, perhaps as expected, that the Kullback-Leibler (KL) divergence D(μ‖μ′) plays an important role in characterizing the generalization error in the domain-adaptation setting. Specifically, we provide generalization error upper bounds for general transfer learning algorithms, and extend the results to a specific empirical risk minimization (ERM) algorithm where data from both distributions are available in the training phase. We further apply the method to iterative, noisy gradient descent algorithms, and obtain upper bounds that can be easily computed using only parameters of the learning algorithms. A few illustrative examples are provided to demonstrate the usefulness of the results. In particular, for certain classification problems our bound is tighter than the corresponding bound derived from Rademacher complexity.
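For orientation, the display below restates the baseline single-distribution bound due to Xu and Raginsky (entry [9] in the related papers), followed by a hedged sketch of how the KL term enters the transfer setting; the second line shows only the general shape of such a bound, with the exact constants and conditions as stated in the paper.

    % Baseline (Xu and Raginsky, 2017): training and test data both drawn from \mu.
    % If the loss \ell(w, Z) is \sigma-sub-Gaussian under Z \sim \mu for every w, then
    \[
    \Big| \mathbb{E}\big[ L_\mu(W) - L_S(W) \big] \Big|
      \;\le\; \sqrt{\frac{2\sigma^2}{n}\, I(S;W)} .
    \]
    % Transfer sketch (illustrative form only, not the paper's exact statement):
    % test data follow \mu' \neq \mu, and a mismatch penalty D(\mu \,\|\, \mu')
    % enters additively alongside the mutual-information term,
    \[
    \Big| \mathbb{E}\big[ L_{\mu'}(W) - L_S(W) \big] \Big|
      \;\lesssim\; \sqrt{2\sigma^2 \Big( D(\mu \,\|\, \mu') + \frac{I(S;W)}{n} \Big)} .
    \]

Here S = (Z_1, ..., Z_n) is the training sample drawn from μ, W is the hypothesis output by the algorithm, L_{μ′} and L_S denote the population risk under the test distribution and the empirical risk on S, and I(S;W) is the mutual information between sample and output. A small D(μ‖μ′) thus certifies that the guarantee degrades gracefully when the test domain shifts.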
Pages: 2819-2824
Number of pages: 6
Related papers (50 total):
  • [1] Wu, Xuetong; Manton, Jonathan H.; Aickelin, Uwe; Zhu, Jingge. On the Generalization for Transfer Learning: An Information-Theoretic Analysis. IEEE Transactions on Information Theory, 2024, 70(10): 7089-7124.
  • [2] Jose, Sharu Theresa; Simeone, Osvaldo. Transfer Learning for Quantum Classifiers: An Information-Theoretic Generalization Analysis. 2023 IEEE Information Theory Workshop (ITW), 2023: 532-537.
  • [3] Ramachandran, Anil; Gupta, Sunil; Rana, Santu; Venkatesh, Svetha. Information-Theoretic Transfer Learning Framework for Bayesian Optimisation. Machine Learning and Knowledge Discovery in Databases (ECML PKDD 2018), Part II, 2019, 11052: 827-842.
  • [4] Bao, Yajie; Li, Yang; Huang, Shao-Lun; Zhang, Lin; Zheng, Lizhong; Zamir, Amir; Guibas, Leonidas. An Information-Theoretic Approach to Transferability in Task Transfer Learning. 2019 IEEE International Conference on Image Processing (ICIP), 2019: 2309-2313.
  • [5] Gouverneur, Amaury; Rodriguez-Galvez, Borja; Oechtering, Tobias J.; Skoglund, Mikael. An Information-Theoretic Analysis of Bayesian Reinforcement Learning. 2022 58th Annual Allerton Conference on Communication, Control, and Computing (Allerton), 2022.
  • [6] Zhang, Sen; Zhang, Jing; Tao, Dacheng. Information-Theoretic Odometry Learning. International Journal of Computer Vision, 2022, 130(11): 2553-2570.
  • [7] Kamimura, R. Information-Theoretic Competitive Learning. Proceedings of the IASTED International Conference on Modelling and Simulation, 2003: 359-365.
  • [8] Iwata, Kazunori. An Information-Theoretic Analysis of Return Maximization in Reinforcement Learning. Neural Networks, 2011, 24(10): 1074-1081.
  • [9] Xu, Aolin; Raginsky, Maxim. Information-Theoretic Analysis of Generalization Capability of Learning Algorithms. Advances in Neural Information Processing Systems 30 (NIPS 2017), 2017.