Information-theoretic analysis for transfer learning

Cited by: 0
Authors
Wu, Xuetong [1 ]
Manton, Jonathan H. [1 ]
Aickelin, Uwe [2 ]
Zhu, Jingge [1 ]
Affiliations
[1] Univ Melbourne, Dept Elect & Elect Engn, Parkville, Vic, Australia
[2] Univ Melbourne, Dept Comp & Informat Syst, Parkville, Vic, Australia
Keywords
DOI
10.1109/isit44484.2020.9173989
Chinese Library Classification
TP301 [Theory, Methods];
Discipline Code
081202 ;
Abstract
Transfer learning, or domain adaptation, concerns machine learning problems in which the training and testing data come from possibly different distributions (denoted μ and μ′, respectively). In this work, we give an information-theoretic analysis of the generalization error and the excess risk of transfer learning algorithms, following a line of work initiated by Russo and Zou. Our results suggest, perhaps as expected, that the Kullback-Leibler (KL) divergence D(μ‖μ′) plays an important role in characterizing the generalization error in the domain-adaptation setting. Specifically, we provide generalization-error upper bounds for general transfer learning algorithms and extend the results to a specific empirical risk minimization (ERM) algorithm in which data from both distributions are available during training. We further apply the method to iterative, noisy gradient descent algorithms and obtain upper bounds that are easy to compute, using only parameters of the learning algorithm. A few illustrative examples demonstrate the usefulness of the results; in particular, on certain classification problems our bound is tighter than the bound derived from Rademacher complexity.
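The central quantity in the abstract is the KL divergence D(μ‖μ′) between the source and target data distributions. As a minimal illustration (not code from the paper), the sketch below evaluates the closed-form KL divergence between two univariate Gaussians, a common stand-in for the source/target pair when reasoning about how distribution shift enters such bounds:

```python
import math

def kl_gaussian(m1: float, s1: float, m2: float, s2: float) -> float:
    """Closed-form KL divergence D(N(m1, s1^2) || N(m2, s2^2))."""
    return math.log(s2 / s1) + (s1**2 + (m1 - m2)**2) / (2 * s2**2) - 0.5

# No domain shift: source and target coincide, so the KL term vanishes.
print(kl_gaussian(0.0, 1.0, 0.0, 1.0))  # 0.0

# Shifted target mean: the KL term grows quadratically with the mean gap,
# inflating any generalization bound that includes D(mu || mu').
print(kl_gaussian(0.0, 1.0, 1.0, 1.0))  # 0.5
```

This is only the divergence term; the paper's actual bounds combine it with algorithm-dependent quantities (e.g. mutual-information terms), which are not reproduced here.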
Pages: 2819 - 2824
Page count: 6
Related Papers
50 items total
  • [41] Information-theoretic learning for FAN network applied to eterokurtic component analysis
    Fiori, S
    [J]. IEE PROCEEDINGS-VISION IMAGE AND SIGNAL PROCESSING, 2002, 149 (06): : 347 - 354
  • [42] On the Direction of Discrimination: An Information-Theoretic Analysis of Disparate Impact in Machine Learning
    Wang, Hao
    Ustun, Berk
    Calmon, Flavio P.
    [J]. 2018 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY (ISIT), 2018, : 1216 - 1220
  • [43] Information-Theoretic Analysis of Epistemic Uncertainty in Bayesian Meta-learning
    Jose, Sharu Theresa
    Park, Sangwoo
    Simeone, Osvaldo
    [J]. INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 151, 2022, 151
  • [44] An Information-Theoretic Analysis of the Impact of Task Similarity on Meta-Learning
    Jose, Sharu Theresa
    Simeone, Osvaldo
    [J]. 2021 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY (ISIT), 2021, : 1534 - 1539
  • [45] Improving information-theoretic competitive learning by accentuated information maximization
    Kamimura, R
    [J]. INTERNATIONAL JOURNAL OF GENERAL SYSTEMS, 2005, 34 (03) : 219 - 233
  • [46] Information maximization and cost minimization in information-theoretic competitive learning
    Kamimura, R
    [J]. PROCEEDINGS OF THE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), VOLS 1-5, 2005, : 202 - 207
  • [47] Forced information maximization to accelerate information-theoretic competitive learning
    Kamimura, Ryotaro
    Kitajima, Ryozo
    [J]. 2007 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-6, 2007, : 1779 - 1784
  • [48] Information-Theoretic Analysis for the Difficulty of Extracting Hidden Information
    Zhang, Wei-ming
    [J]. Wuhan University Journal of Natural Sciences, 2005, (01) : 315 - 318
  • [49] Information-Theoretic Analysis of Minimax Excess Risk
    Hafez-Kolahi, Hassan
    Moniri, Behrad
    Kasaei, Shohreh
    [J]. IEEE TRANSACTIONS ON INFORMATION THEORY, 2023, 69 (07) : 4659 - 4674
  • [50] An Information-Theoretic Analysis of Distributed Resource Allocation
    Alpcan, Tansu
    Dey, Subhrakanti
    [J]. 2013 IEEE 52ND ANNUAL CONFERENCE ON DECISION AND CONTROL (CDC), 2013, : 7327 - 7332