An Information-Theoretic Analysis of the Impact of Task Similarity on Meta-Learning

Citations: 10
Authors
Jose, Sharu Theresa [1 ]
Simeone, Osvaldo [1 ]
Affiliations
[1] Kings Coll London, Dept Engn, Kings Commun Learning & Informat Proc KCLIP Lab, London, England
Funding
European Research Council
DOI
10.1109/ISIT45174.2021.9517767
Chinese Library Classification
TP301 (Theory, Methods)
Discipline Code
081202
Abstract
Meta-learning aims to optimize the hyperparameters of a model class or training algorithm from the observation of data from a number of related tasks. Following the setting of Baxter [1], the tasks are assumed to belong to the same task environment, which is defined by a distribution over the space of tasks and by per-task data distributions. The statistical properties of the task environment thus dictate the similarity of the tasks. The goal of the meta-learner is to ensure that the hyperparameters yield a small loss when applied to the training of a new task sampled from the task environment. The difference between the resulting average loss, known as the meta-population loss, and the corresponding empirical loss measured on the available data from related tasks, known as the meta-generalization gap, is a measure of the generalization capability of the meta-learner. In this paper, we present novel information-theoretic bounds on the average absolute value of the meta-generalization gap. Unlike prior work [2], our bounds explicitly capture the impact of task relatedness, the number of tasks, and the number of data samples per task on the meta-generalization gap. Task similarity is gauged via the Kullback-Leibler (KL) and Jensen-Shannon (JS) divergences. We illustrate the proposed bounds on the example of ridge regression with a meta-learned bias.
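To make the abstract's illustrative example concrete, the following NumPy sketch shows ridge regression with a meta-learned bias vector: each task solves a ridge problem regularized toward a shared vector u, and the meta-learner sets u from data observed on related tasks. The specific meta-learning rule used here (averaging unbiased per-task solutions) and all problem sizes are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

rng = np.random.default_rng(0)

def ridge_with_bias(X, y, u, lam):
    """Per-task ridge solution biased toward the meta-learned vector u:
    argmin_w ||X w - y||^2 + lam * ||w - u||^2."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y + lam * u)

# Toy task environment: per-task weights drawn around a shared mean,
# so tasks are statistically similar (small environment-level spread).
d, n_tasks, n_samples, lam = 5, 20, 10, 1.0
w_env = rng.normal(size=d)                      # environment mean
tasks = []
for _ in range(n_tasks):
    w_t = w_env + 0.1 * rng.normal(size=d)      # per-task ground truth
    X = rng.normal(size=(n_samples, d))
    y = X @ w_t + 0.1 * rng.normal(size=n_samples)
    tasks.append((X, y))

# Meta-learning step (illustrative rule): set the bias u to the mean of
# the unbiased (u = 0) per-task ridge solutions.
u = np.mean([ridge_with_bias(X, y, np.zeros(d), lam) for X, y in tasks], axis=0)

# A new task sampled from the same environment: compare the biased and
# unbiased ridge estimates against the new task's true weights.
w_new = w_env + 0.1 * rng.normal(size=d)
X_new = rng.normal(size=(n_samples, d))
y_new = X_new @ w_new + 0.1 * rng.normal(size=n_samples)
err_meta = np.linalg.norm(ridge_with_bias(X_new, y_new, u, lam) - w_new)
err_zero = np.linalg.norm(ridge_with_bias(X_new, y_new, np.zeros(d), lam) - w_new)
print(err_meta, err_zero)
```

When the task environment concentrates around a shared weight vector, shrinking toward the meta-learned bias u is typically more accurate than shrinking toward zero; how fast the meta-learner's advantage generalizes from observed tasks to new ones is exactly what the paper's bounds quantify.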
Pages: 1534-1539
Page count: 6
Related Papers
50 records in total
  • [31] An Information-Theoretic Framework for Deep Learning
    Jeon, Hong Jun
    Van Roy, Benjamin
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022,
  • [32] Information-theoretic approach to interactive learning
    Still, S.
    EPL, 2009, 85 (02)
  • [33] Mathematical Analysis on Information-Theoretic Metric Learning With Application to Supervised Learning
    Choi, Jooyeon
    Min, Chohong
    Lee, Byungjoon
    IEEE ACCESS, 2019, 7 : 121998 - 122005
  • [34] Transfer Learning for Quantum Classifiers: An Information-Theoretic Generalization Analysis
    Jose, Sharu Theresa
    Simeone, Osvaldo
    2023 IEEE INFORMATION THEORY WORKSHOP, ITW, 2023, : 532 - 537
  • [35] An information-theoretic approach to spectral variability, similarity, and discrimination for hyperspectral image analysis
    Chang, CI
    IEEE TRANSACTIONS ON INFORMATION THEORY, 2000, 46 (05) : 1927 - 1932
  • [36] An Information-Theoretic Machine Learning Approach to Expression QTL Analysis
    Huang, Tao
    Cai, Yu-Dong
    PLOS ONE, 2013, 8 (06):
  • [37] Using information-theoretic approaches for model selection in meta-analysis
    Cinar, Ozan
    Umbanhowar, James
    Hoeksema, Jason D.
    Viechtbauer, Wolfgang
    RESEARCH SYNTHESIS METHODS, 2021, 12 (04) : 537 - 556
  • [38] Unifying cost and information in information-theoretic competitive learning
    Kamimura, R
    NEURAL NETWORKS, 2005, 18 (5-6) : 711 - 718
  • [39] Impact of Information on Network Performance - An Information-Theoretic Perspective
    Hong, Jun
    Li, Victor O. K.
    GLOBECOM 2009 - 2009 IEEE GLOBAL TELECOMMUNICATIONS CONFERENCE, VOLS 1-8, 2009, : 3540 - 3545
  • [40] Forced information and information loss in information-theoretic competitive learning
    Kamimura, Ryotaro
    PROCEEDINGS OF THE IASTED INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND APPLICATIONS, 2007, : 69 - 74