An Information-Theoretic Analysis of the Impact of Task Similarity on Meta-Learning

Cited by: 10
Authors:
Jose, Sharu Theresa [1]; Simeone, Osvaldo [1]
Affiliations:
[1] Kings Coll London, Dept Engn, Kings Commun Learning & Informat Proc KCLIP Lab, London, England
Funding:
European Research Council
DOI:
10.1109/ISIT45174.2021.9517767
CLC Classification:
TP301 [Theory, Methods]
Discipline Code:
081202
Abstract:
Meta-learning aims at optimizing the hyperparameters of a model class or training algorithm from the observation of data from a number of related tasks. Following the setting of Baxter [1], the tasks are assumed to belong to the same task environment, which is defined by a distribution over the space of tasks and by per-task data distributions. The statistical properties of the task environment thus dictate the similarity of the tasks. The goal of the meta-learner is to ensure that the hyperparameters obtain a small loss when applied for training of a new task sampled from the task environment. The difference between the resulting average loss, known as meta-population loss, and the corresponding empirical loss measured on the available data from related tasks, known as meta-generalization gap, is a measure of the generalization capability of the meta-learner. In this paper, we present novel information-theoretic bounds on the average absolute value of the meta-generalization gap. Unlike prior work [2], our bounds explicitly capture the impact of task relatedness, the number of tasks, and the number of data samples per task on the meta-generalization gap. Task similarity is gauged via the Kullback-Leibler (KL) and Jensen-Shannon (JS) divergences. We illustrate the proposed bounds on the example of ridge regression with meta-learned bias.
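To make the closing example concrete, the following is a minimal sketch of ridge regression with a meta-learned bias. It is not the paper's construction: the closed-form solver `ridge_with_bias`, the environment mean `w_env`, the choice of λ, and the simple "average the per-task solutions" estimator for the bias `u` are all illustrative assumptions. The idea it demonstrates is the one the abstract names: the hyperparameter (here, the bias vector `u`) is learned from several related tasks, then reused when training on a new task drawn from the same task environment.

```python
import numpy as np

def ridge_with_bias(X, y, u, lam):
    """Ridge regression regularized toward a bias vector u:
    w = argmin_w ||y - X w||^2 + lam * ||w - u||^2
      = (X^T X + lam * I)^{-1} (X^T y + lam * u).
    """
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y + lam * u)

rng = np.random.default_rng(0)
d, n_per_task, lam = 3, 20, 1.0
w_env = np.array([1.0, -2.0, 0.5])  # mean of the task environment (assumed)

# Meta-training: solve each related task with zero bias, then set the
# meta-learned bias u to the average of the per-task solutions.
per_task_w = []
for _ in range(10):
    w_task = w_env + 0.1 * rng.standard_normal(d)  # tasks are similar
    X = rng.standard_normal((n_per_task, d))
    y = X @ w_task + 0.1 * rng.standard_normal(n_per_task)
    per_task_w.append(ridge_with_bias(X, y, np.zeros(d), lam))
u = np.mean(per_task_w, axis=0)  # meta-learned bias

# Meta-test: a new task with only a few samples. Regularizing toward u
# exploits task similarity, whereas plain ridge (u = 0) ignores it.
w_new = w_env + 0.1 * rng.standard_normal(d)
X_new = rng.standard_normal((5, d))
y_new = X_new @ w_new + 0.1 * rng.standard_normal(5)
err_biased = np.linalg.norm(ridge_with_bias(X_new, y_new, u, lam) - w_new)
err_plain = np.linalg.norm(ridge_with_bias(X_new, y_new, np.zeros(d), lam) - w_new)
print(err_biased, err_plain)
```

When the per-task parameters concentrate around the environment mean, pulling the new-task solution toward `u` typically reduces estimation error in the few-sample regime; the meta-generalization gap studied in the paper quantifies how reliably such a `u`, fit on finitely many tasks, transfers to fresh tasks.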
Pages: 1534 - 1539
Page count: 6
Related Papers (50 total)
  • [21] Transfer Meta-Learning: Information-Theoretic Bounds and Information Meta-Risk Minimization
    Jose, Sharu Theresa
    Simeone, Osvaldo
    Durisi, Giuseppe
    IEEE TRANSACTIONS ON INFORMATION THEORY, 2022, 68 (01) : 474 - 501
  • [22] Policy Information Capacity: Information-Theoretic Measure for Task Complexity in Deep Reinforcement Learning
    Furuta, Hiroki
    Matsushima, Tatsuya
    Kozuno, Tadashi
    Matsuo, Yutaka
    Levine, Sergey
    Nachum, Ofir
    Gu, Shixiang Shane
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [23] Information-theoretic analysis of information hiding
    Moulin, P
    O'Sullivan, JA
    2000 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY, PROCEEDINGS, 2000, : 19 - 19
  • [24] Information-theoretic analysis of information hiding
    Moulin, P
    O'Sullivan, JA
    IEEE TRANSACTIONS ON INFORMATION THEORY, 2003, 49 (03) : 563 - 593
  • [25] An information-theoretic analysis of resting-state versus task fMRI
    Tuominen, Julia
    Specht, Karsten
    Vaisvilaite, Liucija
    Zeidman, Peter
    NETWORK NEUROSCIENCE, 2023, 7 (02) : 769 - 786
  • [26] Information-theoretic analysis of watermarking
    Moulin, P
    O'Sullivan, JA
    2000 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS, VOLS I-VI, 2000, : 3630 - 3633
  • [27] Reinforcement Learning with Information-Theoretic Actuation
    Catt, Elliot
    Hutter, Marcus
    Veness, Joel
    ARTIFICIAL GENERAL INTELLIGENCE, AGI 2022, 2023, 13539 : 188 - 198
  • [28] An Information-Theoretic Analysis of Deduplication
    Niesen, Urs
    IEEE TRANSACTIONS ON INFORMATION THEORY, 2019, 65 (09) : 5688 - 5704
  • [29] An Information-Theoretic Analysis of Deduplication
    Niesen, Urs
    2017 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY (ISIT), 2017, : 1738 - 1742
  • [30] An Enriched Information-Theoretic Definition of Semantic Similarity in a Taxonomy
    Formica, Anna
    Taglino, Francesco
    IEEE ACCESS, 2021, 9 : 100583 - 100593