Recursive Adaptation of Stepsize Parameter for Non-stationary Environments

Cited: 0
Author(s)
Noda, Itsuki [1]
Affiliation
[1] Natl Inst Adv Ind Sci & Technol, ITRI, Tsukuba, Ibaraki, Japan
Source
ADAPTIVE AND LEARNING AGENTS | 2010 / Vol. 5924
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this article, we propose a method to adapt the stepsize parameters used in reinforcement learning for non-stationary environments. In typical reinforcement learning settings, the stepsize parameter is decreased toward zero during learning, because the environment is assumed to be noisy but stationary, so that the true expected rewards are fixed. In the real world, however, we assume that the true expected reward changes over time, and hence the learning agent must adapt to the change through continuous learning. We derive the higher-order derivatives, with respect to the stepsize parameter, of the exponential moving average (which is used to estimate the expected values of states or actions in major reinforcement learning methods). We also present a mechanism to calculate these derivatives recursively. Using this mechanism, we construct a precise and flexible adaptation method for the stepsize parameter that optimizes a given criterion, for example, minimizing squared errors. The proposed method is validated both theoretically and experimentally.
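The recursive scheme described in the abstract can be illustrated with a first-order sketch: maintain the EMA estimate, maintain its derivative with respect to the stepsize alpha via the same recursion, and nudge alpha by gradient descent on the squared one-step error. This is an illustrative simplification under assumed symbols (estimate `x`, stepsize `alpha`, reward `r`), not the paper's full higher-order method:

```python
import random

def simulate(steps=3000, eta=0.01, seed=0):
    """Track a drifting reward mean with an EMA whose stepsize alpha is
    itself adapted online. First-order sketch: only dx/dalpha is kept,
    whereas the paper derives and uses higher-order derivatives."""
    rng = random.Random(seed)
    alpha = 0.05   # stepsize being adapted
    x = 0.0        # EMA estimate of the expected reward
    dx = 0.0       # d x / d alpha, maintained recursively
    target = 0.0
    sq_errs = []
    for t in range(steps):
        if t % 500 == 0:                  # non-stationary: true mean jumps
            target = rng.uniform(-1.0, 1.0)
        r = target + rng.gauss(0.0, 0.1)  # noisy observed reward
        delta = r - x                     # one-step prediction error
        # gradient step on alpha to reduce delta^2, using the derivative
        # of the *previous* estimate; clip to keep alpha in a valid range
        alpha = min(0.9, max(1e-3, alpha + eta * delta * dx))
        # recursive derivative of the EMA update x' = (1-alpha)*x + alpha*r:
        #   dx' = (1-alpha)*dx + (r - x)
        dx = (1.0 - alpha) * dx + delta
        x = x + alpha * delta             # EMA update itself
        sq_errs.append(delta * delta)
    return alpha, sum(sq_errs[-500:]) / 500.0

if __name__ == "__main__":
    final_alpha, recent_mse = simulate()
    print(final_alpha, recent_mse)
```

When the environment jumps, the adapted alpha tends to rise so the estimate tracks the new mean quickly, then shrink again once the error is dominated by noise; the clipping bounds are a practical safeguard, not part of the derivation.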
Pages: 74 - 90 (17 pages)
Related Papers
50 records in total
  • [31] Dynamic Adaptation on Non-stationary Visual Domains
    Shkodrani, Sindi
    Hofmann, Michael
    Gavves, Efstratios
    [J]. COMPUTER VISION - ECCV 2018 WORKSHOPS, PT II, 2019, 11130 : 158 - 171
  • [32] Supporting Sensor Orchestration in Non-Stationary Environments
    Holst, Christoph-Alexander
    Lohweg, Volker
    [J]. 2018 ACM INTERNATIONAL CONFERENCE ON COMPUTING FRONTIERS, 2018, : 363 - 370
  • [33] Factored Adaptation for Non-stationary Reinforcement Learning
    Feng, Fan
    Huang, Biwei
    Zhang, Kun
    Magliacane, Sara
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [34] THE COMBINING OF FORECASTS USING RECURSIVE TECHNIQUES WITH NON-STATIONARY WEIGHTS
    SESSIONS, DN
    CHATTERJEE, S
    [J]. JOURNAL OF FORECASTING, 1989, 8 (03) : 239 - 251
  • [35] A Unified Channel Estimation Framework for Stationary and Non-Stationary Fading Environments
    Shi, Qi
    Liu, Yangyu
    Zhang, Shunqing
    Xu, Shugong
    Lau, Vincent K. N.
    [J]. IEEE TRANSACTIONS ON COMMUNICATIONS, 2021, 69 (07) : 4937 - 4952
  • [36] Fundamental Limits of Age-of-Information in Stationary and Non-stationary Environments
    Banerjee, Subhankar
    Bhattacharjee, Rajarshi
    Sinha, Abhishek
    [J]. 2020 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY (ISIT), 2020, : 1741 - 1746
  • [37] Sub-structural niching in non-stationary environments
    Sastry, K
    Abbass, HA
    Goldberg, DE
    [J]. AI 2004: ADVANCES IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2004, 3339 : 873 - 885
  • [38] A robust incremental learning method for non-stationary environments
    Martinez-Rego, David
    Perez-Sanchez, Beatriz
    Fontenla-Romero, Oscar
    Alonso-Betanzos, Amparo
    [J]. NEUROCOMPUTING, 2011, 74 (11) : 1800 - 1808
  • [39] Reinforcement learning in episodic non-stationary Markovian environments
    Choi, SPM
    Zhang, NL
    Yeung, DY
    [J]. IC-AI '04 & MLMTA'04 , VOL 1 AND 2, PROCEEDINGS, 2004, : 752 - 758
  • [40] A heterogeneous online learning ensemble for non-stationary environments
    Idrees, Mobin M.
    Minku, Leandro L.
    Stahl, Frederic
    Badii, Atta
    [J]. KNOWLEDGE-BASED SYSTEMS, 2020, 188