An Information-Theoretic Analysis of the Impact of Task Similarity on Meta-Learning

Cited by: 10
Authors:
Jose, Sharu Theresa [1]; Simeone, Osvaldo [1]
Affiliations:
[1] Kings Coll London, Dept Engn, Kings Commun Learning & Informat Proc KCLIP Lab, London, England
Funding:
European Research Council
DOI:
10.1109/ISIT45174.2021.9517767
CLC Classification:
TP301 [Theory, Methods]
Discipline Code:
081202
Abstract:
Meta-learning aims at optimizing the hyperparameters of a model class or training algorithm from the observation of data from a number of related tasks. Following the setting of Baxter [1], the tasks are assumed to belong to the same task environment, which is defined by a distribution over the space of tasks and by per-task data distributions. The statistical properties of the task environment thus dictate the similarity of the tasks. The goal of the meta-learner is to ensure that the hyperparameters obtain a small loss when applied for training of a new task sampled from the task environment. The difference between the resulting average loss, known as meta-population loss, and the corresponding empirical loss measured on the available data from related tasks, known as meta-generalization gap, is a measure of the generalization capability of the meta-learner. In this paper, we present novel information-theoretic bounds on the average absolute value of the meta-generalization gap. Unlike prior work [2], our bounds explicitly capture the impact of task relatedness, the number of tasks, and the number of data samples per task on the meta-generalization gap. Task similarity is gauged via the Kullback-Leibler (KL) and Jensen-Shannon (JS) divergences. We illustrate the proposed bounds on the example of ridge regression with meta-learned bias.
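To make the closing example concrete, the following is a minimal sketch of ridge regression with a meta-learned bias. It is not the paper's construction: the closed-form solver `ridge_with_bias`, the environment mean `w_env`, the choice of λ, and the simple "average the per-task solutions" estimator for the bias `u` are all illustrative assumptions. The idea it demonstrates is the one the abstract names: the hyperparameter (here, the bias vector `u`) is learned from several related tasks, then reused when training on a new task drawn from the same task environment.

```python
import numpy as np

def ridge_with_bias(X, y, u, lam):
    """Ridge regression regularized toward a bias vector u:
    w = argmin_w ||y - X w||^2 + lam * ||w - u||^2
      = (X^T X + lam * I)^{-1} (X^T y + lam * u).
    """
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y + lam * u)

rng = np.random.default_rng(0)
d, n_per_task, lam = 3, 20, 1.0
w_env = np.array([1.0, -2.0, 0.5])  # mean of the task environment (assumed)

# Meta-training: solve each related task with zero bias, then set the
# meta-learned bias u to the average of the per-task solutions.
per_task_w = []
for _ in range(10):
    w_task = w_env + 0.1 * rng.standard_normal(d)  # tasks are similar
    X = rng.standard_normal((n_per_task, d))
    y = X @ w_task + 0.1 * rng.standard_normal(n_per_task)
    per_task_w.append(ridge_with_bias(X, y, np.zeros(d), lam))
u = np.mean(per_task_w, axis=0)  # meta-learned bias

# Meta-test: a new task with only a few samples. Regularizing toward u
# exploits task similarity, whereas plain ridge (u = 0) ignores it.
w_new = w_env + 0.1 * rng.standard_normal(d)
X_new = rng.standard_normal((5, d))
y_new = X_new @ w_new + 0.1 * rng.standard_normal(5)
err_biased = np.linalg.norm(ridge_with_bias(X_new, y_new, u, lam) - w_new)
err_plain = np.linalg.norm(ridge_with_bias(X_new, y_new, np.zeros(d), lam) - w_new)
print(err_biased, err_plain)
```

When the per-task parameters concentrate around the environment mean, pulling the new-task solution toward `u` typically reduces estimation error in the few-sample regime; the meta-generalization gap studied in the paper quantifies how reliably such a `u`, fit on finitely many tasks, transfers to fresh tasks.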
Pages: 1534 - 1539
Page count: 6
Related Papers (50 total)
  • [21] Transfer Meta-Learning: Information-Theoretic Bounds and Information Meta-Risk Minimization
    Jose, Sharu Theresa
    Simeone, Osvaldo
    Durisi, Giuseppe
    IEEE TRANSACTIONS ON INFORMATION THEORY, 2022, 68 (01) : 474 - 501
  • [22] Policy Information Capacity: Information-Theoretic Measure for Task Complexity in Deep Reinforcement Learning
    Furuta, Hiroki
    Matsushima, Tatsuya
    Kozuno, Tadashi
    Matsuo, Yutaka
    Levine, Sergey
    Nachum, Ofir
    Gu, Shixiang Shane
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [23] Information-theoretic analysis of information hiding
    Moulin, P
    O'Sullivan, JA
    2000 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY, PROCEEDINGS, 2000, : 19 - 19
  • [24] Information-theoretic analysis of information hiding
    Moulin, P
    O'Sullivan, JA
    IEEE TRANSACTIONS ON INFORMATION THEORY, 2003, 49 (03) : 563 - 593
  • [25] An information-theoretic analysis of resting-state versus task fMRI
    Tuominen, Julia
    Specht, Karsten
    Vaisvilaite, Liucija
    Zeidman, Peter
    NETWORK NEUROSCIENCE, 2023, 7 (02) : 769 - 786
  • [26] Information-theoretic analysis of watermarking
    Moulin, P
    O'Sullivan, JA
    2000 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS, VOLS I-VI, 2000, : 3630 - 3633
  • [27] Reinforcement Learning with Information-Theoretic Actuation
    Catt, Elliot
    Hutter, Marcus
    Veness, Joel
    ARTIFICIAL GENERAL INTELLIGENCE, AGI 2022, 2023, 13539 : 188 - 198
  • [28] An Information-Theoretic Analysis of Deduplication
    Niesen, Urs
    IEEE TRANSACTIONS ON INFORMATION THEORY, 2019, 65 (09) : 5688 - 5704
  • [29] An Information-Theoretic Analysis of Deduplication
    Niesen, Urs
    2017 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY (ISIT), 2017, : 1738 - 1742
  • [30] An Enriched Information-Theoretic Definition of Semantic Similarity in a Taxonomy
    Formica, Anna
    Taglino, Francesco
    IEEE ACCESS, 2021, 9 : 100583 - 100593