An Information-Theoretic Analysis of the Impact of Task Similarity on Meta-Learning

Cited by: 10
Authors: Jose, Sharu Theresa [1]; Simeone, Osvaldo [1]
Affiliation: [1] Kings Coll London, Dept Engn, Kings Commun Learning & Informat Proc KCLIP Lab, London, England
Funding: European Research Council
DOI: 10.1109/ISIT45174.2021.9517767
CLC number: TP301 [Theory, Methods]
Discipline code: 081202
Abstract
Meta-learning aims at optimizing the hyperparameters of a model class or training algorithm from the observation of data from a number of related tasks. Following the setting of Baxter [1], the tasks are assumed to belong to the same task environment, which is defined by a distribution over the space of tasks and by per-task data distributions. The statistical properties of the task environment thus dictate the similarity of the tasks. The goal of the meta-learner is to ensure that the hyperparameters incur a small loss when used for training on a new task sampled from the task environment. The difference between the resulting average loss, known as the meta-population loss, and the corresponding empirical loss measured on the available data from related tasks, known as the meta-generalization gap, is a measure of the generalization capability of the meta-learner. In this paper, we present novel information-theoretic bounds on the average absolute value of the meta-generalization gap. Unlike prior work [2], our bounds explicitly capture the impact of task relatedness, the number of tasks, and the number of data samples per task on the meta-generalization gap. Task similarity is gauged via the Kullback-Leibler (KL) and Jensen-Shannon (JS) divergences. We illustrate the proposed bounds on the example of ridge regression with meta-learned bias.
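The quantities in the abstract can be made concrete with a small numerical sketch of ridge regression with a meta-learned bias. This is an illustrative toy setup, not the paper's construction: the task environment, the noise levels, and the simple averaging meta-learner below are all assumptions chosen for brevity. Each task's base learner solves a ridge problem regularized toward a bias vector `u`, and the meta-generalization gap is estimated as the difference between the loss on a fresh task from the same environment (a proxy for the meta-population loss) and the empirical meta-loss on the observed tasks.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, lam = 3, 20, 1.0  # feature dimension, samples per task, ridge strength

def sample_task(env_mean, spread=0.1):
    # Task environment: per-task weight vector drawn around env_mean.
    # The spread controls task similarity (smaller spread = more related tasks).
    return env_mean + spread * rng.standard_normal(d)

def sample_data(w_true, n):
    # Per-task data distribution: linear-Gaussian observations.
    X = rng.standard_normal((n, d))
    y = X @ w_true + 0.1 * rng.standard_normal(n)
    return X, y

def ridge_with_bias(X, y, u, lam):
    # Base learner: w = argmin ||Xw - y||^2 + lam * ||w - u||^2,
    # solved in closed form; u is the meta-learned bias (hyperparameter).
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]),
                           X.T @ y + lam * u)

env_mean = np.ones(d)
tasks = [sample_data(sample_task(env_mean), n) for _ in range(5)]

# Meta-learner (simple heuristic stand-in): set the bias u to the average
# of the per-task ridge solutions obtained with a zero bias.
u = np.mean([ridge_with_bias(X, y, np.zeros(d), lam) for X, y in tasks],
            axis=0)

# Empirical meta-loss: average training loss over the observed tasks.
meta_emp_loss = np.mean([np.mean((X @ ridge_with_bias(X, y, u, lam) - y) ** 2)
                         for X, y in tasks])

# Proxy for the meta-population loss: train with bias u on a fresh task
# from the environment and evaluate on held-out data from that task.
w_new = sample_task(env_mean)
X_tr, y_tr = sample_data(w_new, n)
X_te, y_te = sample_data(w_new, n)
w_hat = ridge_with_bias(X_tr, y_tr, u, lam)
meta_pop_loss = np.mean((X_te @ w_hat - y_te) ** 2)

meta_gap = meta_pop_loss - meta_emp_loss
```

Rerunning this with a larger `spread` (less related tasks) or fewer tasks tends to enlarge the gap, which is the qualitative dependence the paper's bounds make explicit.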
Pages: 1534 - 1539
Page count: 6