Learning with incomplete information and the mathematical structure behind it

Cited by: 0
Authors
Kuehn, Reimer
Stamatescu, Ion-Olimpiu [1]
Affiliations
[1] Kings Coll London, Dept Math, London WC2R 2LS, England
[2] Univ Heidelberg, FESt, D-6900 Heidelberg, Germany
[3] Univ Heidelberg, Inst Theoret Phys, D-6900 Heidelberg, Germany
DOI
10.1007/s00422-007-0162-4
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
We investigate the problem of learning with incomplete information as exemplified by learning with delayed reinforcement. We study a two-phase learning scenario in which a phase of Hebbian associative learning based on momentary internal representations is supplemented by an 'unlearning' phase depending on a graded reinforcement signal. The reinforcement signal quantifies the success rate globally for a number of learning steps in phase one, and 'unlearning' is indiscriminate with respect to associations learnt in that phase. Learning according to this model is studied via simulations and analytically within a student-teacher scenario for both single-layer networks and a committee machine. Success and speed of learning depend on the ratio λ of the learning rates used for the associative Hebbian learning phase and for the unlearning correction in response to the reinforcement signal, respectively. Asymptotically perfect generalization is possible only if this ratio exceeds a critical value λ_c, in which case the generalization error exhibits a power-law decay with the number of examples seen by the student, with an exponent that depends in a non-universal manner on the parameter λ. We find these features to be robust against a wide spectrum of modifications of microscopic modelling details. Two illustrative applications are also provided: a robot learning to navigate a field containing obstacles, and the problem of identifying a specific component in a collection of stimuli.
Pages: 99 - 112
Number of pages: 14
Related papers
50 items in total
  • [31] Learning in multilevel games with incomplete information - Part I
    Billard, E
    Lakshmivarahan, S
    IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, 1999, 29 (03) : 329 - 339
  • [32] Learning and informational stability of dynamic REE with incomplete information
    Rondina, Giacomo
    Walker, Todd B.
    REVIEW OF ECONOMIC DYNAMICS, 2016, 21 : 147 - 159
  • [33] Structure Feature Learning Method for Incomplete Data
    Zhou, Xiabing
    Xing, Xingxing
    Han, Lei
    Hong, Haikun
    Bian, Kaigui
    Xie, Kunqing
    INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2016, 30 (09)
  • [34] Valuation structure in incomplete information contests: experimental evidence
    Diego Aycinena
    Rimvydas Baltaduonis
    Lucas Rentschler
    Public Choice, 2019, 179 : 195 - 208
  • [35] Valuation structure in incomplete information contests: experimental evidence
    Aycinena, Diego
    Baltaduonis, Rimvydas
    Rentschler, Lucas
    PUBLIC CHOICE, 2019, 179 (3-4) : 195 - 208
  • [36] Social and Cognitive System for Learning Negotiation Strategies with Incomplete Information
    Chohra, Amine
    Bahrammirzaee, Arash
    Madani, Kurosh
    BIO-INSPIRED SYSTEMS: COMPUTATIONAL AND AMBIENT INTELLIGENCE, PT 1, 2009, 5517 : 610 - 618
  • [37] MINDTL: Multiple Incomplete Domains Transfer Learning for Information Recommendation
    He, Ming
    Zhang, Jiuling
    Zhang, Jiang
    CHINA COMMUNICATIONS, 2017, 14 (11) : 218 - 236
  • [38] MINDTL: Multiple Incomplete Domains Transfer Learning for Information Recommendation
    Ming He
    Jiuling Zhang
    Jiang Zhang
    China Communications (中国通信), 2017, 14 (11) : 218 - 236
  • [39] Online Learning for Equilibrium Pricing in Markets under Incomplete Information
    Jalota, Devansh
    Sun, Haoyuan
    Azizan, Navid
    2023 62ND IEEE CONFERENCE ON DECISION AND CONTROL, CDC, 2023 : 4996 - 5001
  • [40] Reinforcement Learning for Constrained Energy Trading Games With Incomplete Information
    Wang, Huiwei
    Huang, Tingwen
    Liao, Xiaofeng
    Abu-Rub, Haitham
    Chen, Guo
    IEEE TRANSACTIONS ON CYBERNETICS, 2017, 47 (10) : 3404 - 3416