Recursive Adaptation of Stepsize Parameter for Non-stationary Environments

Cited by: 0
Authors
Noda, Itsuki [1 ]
Affiliations
[1] Natl Inst Adv Ind Sci & Technol, ITRI, Tsukuba, Ibaraki, Japan
Source
ADAPTIVE AND LEARNING AGENTS | 2010 / Vol. 5924
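Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes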
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this article, we propose a method to adapt stepsize parameters used in reinforcement learning for non-stationary environments. In general reinforcement learning situations, a stepsize parameter is decreased to zero during learning, because the environment is generally supposed to be noisy but stationary, such that the true expected rewards are fixed. On the other hand, we assume that in the real world, the true expected reward changes over time and hence, the learning agent must adapt to the change through continuous learning. We derive the higher-order derivatives of the exponential moving average (which is used to estimate the expected values of states or actions in major reinforcement learning methods) with respect to the stepsize parameter. We also illustrate a mechanism to calculate these derivatives in a recursive manner. Using the mechanism, we construct a precise and flexible adaptation method for the stepsize parameter in order to optimize a certain criterion, for example, to minimize square errors. The proposed method is validated both theoretically and experimentally.
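The idea of differentiating an exponential moving average with respect to its stepsize can be illustrated with a minimal sketch. It assumes the standard update mean <- mean + alpha * (x - mean), for which the first derivative obeys the recursion d <- (1 - alpha) * d + (x - mean). The function names, the learning rate `eta`, the clipping bounds, and the plain gradient step on the squared error are all illustrative assumptions; this is not the update rule from the article, which also uses higher-order derivatives.

```python
def ema_and_derivative(xs, alpha):
    """Return the exponential moving average of xs and its first
    derivative with respect to the (fixed) stepsize alpha."""
    mean = 0.0    # running estimate, started at 0 (assumption)
    d_mean = 0.0  # d(mean)/d(alpha), computed recursively
    for x in xs:
        err = x - mean
        # Differentiate mean_next = (1 - alpha) * mean + alpha * x
        # with respect to alpha, using the old mean inside err.
        d_mean = (1.0 - alpha) * d_mean + err
        mean = mean + alpha * err
    return mean, d_mean


def adapt_stepsize(xs, alpha=0.1, eta=0.01, alpha_min=1e-3, alpha_max=1.0):
    """Hypothetical gradient-style adaptation of alpha that reduces
    the squared prediction error 0.5 * (x - mean) ** 2.

    Because alpha changes over time, the recursive derivative is only
    an approximation here.
    """
    mean, d_mean = 0.0, 0.0
    history = []
    for x in xs:
        err = x - mean
        # The gradient of 0.5 * err**2 w.r.t. alpha is -err * d_mean,
        # so step in the opposite direction and clip to the bounds.
        alpha = min(alpha_max, max(alpha_min, alpha + eta * err * d_mean))
        d_mean = (1.0 - alpha) * d_mean + err
        mean = mean + alpha * err
        history.append((mean, alpha))
    return history
```

Calling `adapt_stepsize(observations)` returns one `(estimate, stepsize)` pair per observation; setting `eta=0` reduces it to an ordinary fixed-stepsize moving average.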
Pages: 74-90
Page count: 17
Related papers
50 in total
  • [1] Recursive Adaptation of Stepsize Parameter for Non-stationary Environments
    Noda, Itsuki
    [J]. PRINCIPLES OF PRACTICE IN MULTI-AGENT SYSTEMS, 2009, 5925 : 525 - 533
  • [2] Evolutionary adaptation in non-stationary environments: A case study
    Obuchowicz, Andrzej
    Wawrzyniak, Dariusz
    [J]. PARALLEL PROCESSING AND APPLIED MATHEMATICS, 2006, 3911 : 439 - 446
  • [3] Meta-learning optimal parameter values in non-stationary environments
    Sikora, Riyaz T.
    [J]. KNOWLEDGE-BASED SYSTEMS, 2008, 21 (08) : 800 - 806
  • [5] Detection and estimation in non-stationary environments
    Toolan, TM
    Tufts, DW
    [J]. CONFERENCE RECORD OF THE THIRTY-SEVENTH ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS & COMPUTERS, VOLS 1 AND 2, 2003, : 797 - 801
  • [6] Adaptive beamforming in non-stationary environments
    Cox, H
    [J]. THIRTY-SIXTH ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS & COMPUTERS - CONFERENCE RECORD, VOLS 1 AND 2, CONFERENCE RECORD, 2002, : 431 - 438
  • [7] Social Learning in non-stationary environments
    Boursier, Etienne
    Perchet, Vianney
    Scarsini, Marco
    [J]. INTERNATIONAL CONFERENCE ON ALGORITHMIC LEARNING THEORY, VOL 167, 2022, 167
  • [8] Rewiring Neurons in Non-Stationary Environments
    Sun, Zhicheng
    Mu, Yadong
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [9] FLOODING RISK ASSESSMENT IN STATIONARY AND NON-STATIONARY ENVIRONMENTS
    Thomson, Rhys
    Drynan, Leo
    Ball, James
    Veldema, Ailsa
    Phillips, Brett
    Babister, Mark
    [J]. PROCEEDINGS OF THE 36TH IAHR WORLD CONGRESS: DELTAS OF THE FUTURE AND WHAT HAPPENS UPSTREAM, 2015, : 5167 - 5177
  • [10] The recursive maximum likelihood algorithm for non-stationary signals
    Debes, Christian
    Zoubir, Abdelhak M.
    [J]. 2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 3777 - 3780