Recursive Adaptation of Stepsize Parameter for Non-stationary Environments

Cited by: 0
Authors
Noda, Itsuki [1]
Affiliations
[1] Natl Inst Adv Ind Sci & Technol, ITRI, Tsukuba, Ibaraki, Japan
Source
ADAPTIVE AND LEARNING AGENTS | 2010 / Vol. 5924
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Theory of Artificial Intelligence]
Discipline Codes
081104 ; 0812 ; 0835 ; 1405
Abstract
In this article, we propose a method for adapting the stepsize parameters used in reinforcement learning to non-stationary environments. In typical reinforcement learning settings, the stepsize parameter is decreased toward zero during learning, because the environment is assumed to be noisy but stationary, so that the true expected rewards are fixed. In the real world, however, the true expected reward changes over time, and the learning agent must therefore adapt to such changes through continuous learning. We derive the higher-order derivatives, with respect to the stepsize parameter, of the exponential moving average that major reinforcement learning methods use to estimate the expected values of states or actions, and we show how to compute these derivatives recursively. Using this mechanism, we construct a precise and flexible adaptation method for the stepsize parameter that optimizes a given criterion, for example, minimizing the squared error. The proposed method is validated both theoretically and experimentally.
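The sketch below is a rough, first-order illustration of the idea in the abstract: it maintains an exponential moving average together with its derivative with respect to the stepsize, updates both recursively, and nudges the stepsize by gradient descent on the squared prediction error. It is an assumption-laden simplification, not the paper's algorithm (which uses higher-order derivatives); the function names and the meta-stepsize meta_lr are illustrative.

# Minimal first-order sketch of stepsize adaptation for an exponential
# moving average (EMA). Assumption: this is NOT the paper's exact method,
# which recursively uses higher-order derivatives; here only the first
# derivative of the EMA with respect to the stepsize is tracked.
import random

def make_adaptive_ema(alpha=0.1, meta_lr=0.01):
    """EMA estimator whose stepsize alpha adapts online to reduce squared error."""
    state = {"x": 0.0, "dx_dalpha": 0.0, "alpha": alpha}

    def update(reward):
        x, d, a = state["x"], state["dx_dalpha"], state["alpha"]
        error = reward - x                       # current prediction error
        # Meta-gradient step on the squared error:
        #   d(error^2)/d(alpha) = -2 * error * dx/dalpha
        a = min(max(a + meta_lr * error * d, 1e-4), 1.0)
        # Recursive derivative of the EMA update x' = (1 - a) x + a r:
        #   dx'/dalpha = (1 - a) dx/dalpha + (r - x)
        d = (1.0 - a) * d + error
        x = (1.0 - a) * x + a * reward
        state.update(x=x, dx_dalpha=d, alpha=a)
        return x, a

    return update

# Usage: the true expected reward jumps from 1.0 to 5.0 halfway through;
# alpha should grow after the jump and shrink again once the estimate settles.
update = make_adaptive_ema()
for t in range(2000):
    true_mean = 1.0 if t < 1000 else 5.0
    estimate, alpha = update(true_mean + random.gauss(0.0, 0.5))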
Pages: 74 - 90
Page count: 17
Related papers
50 records in total
  • [41] Reinforcement learning in episodic non-stationary Markovian environments
    Choi, SPM
    Zhang, NL
    Yeung, DY
    [J]. IC-AI '04 & MLMTA'04 , VOL 1 AND 2, PROCEEDINGS, 2004, : 752 - 758
  • [42] Learning spectrum opportunities in non-stationary radio environments
    Oksanen, Jan
    Koivunen, Visa
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 2447 - 2451
  • [43] Stochastic Bandits with Graph Feedback in Non-Stationary Environments
    Lu, Shiyin
    Hu, Yao
    Zhang, Lijun
    [J]. THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 8758 - 8766
  • [44] Minority games and distributed coordination in non-stationary environments
    Galstyan, A
    Lerman, K
    [J]. PROCEEDING OF THE 2002 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-3, 2002, : 2610 - 2614
  • [45] A Comparison of Adaptation Techniques for the Solution of Non-stationary Flow
    Felcman, J.
    Kubera, P.
    [J]. NUMERICAL ANALYSIS AND APPLIED MATHEMATICS, 2008, 1048 : 835 - 838
  • [46] Adaptive deep reinforcement learning for non-stationary environments
    Zhu, Jin
    Wei, Yutong
    Kang, Yu
    Jiang, Xiaofeng
    Dullerud, Geir E.
    [J]. SCIENCE CHINA-INFORMATION SCIENCES, 2022, 65 (10)
  • [47] Tracking the Best Expert in Non-stationary Stochastic Environments
    Wei, Chen-Yu
    Hong, Yi-Te
    Lu, Chi-Jen
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 29 (NIPS 2016), 2016, 29
  • [48] Adaptive deep reinforcement learning for non-stationary environments
    Jin Zhu
    Yutong Wei
    Yu Kang
    Xiaofeng Jiang
    Geir E. Dullerud
    [J]. Science China Information Sciences, 2022, 65
  • [49] Stochastic Bandits with Graph Feedback in Non-Stationary Environments
    Lu, Shiyin
    Hu, Yao
    Zhang, Lijun
    (National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China; second affiliation unspecified, 100102, China)
    [J]. AAAI Conference on Artificial Intelligence, 2021, 35 : 8758 - 8766
  • [50] Weighted Gaussian Process Bandits for Non-stationary Environments
    Deng, Yuntian
    Zhou, Xingyu
    Kim, Baekjin
    Tewari, Ambuj
    Gupta, Abhishek
    Shroff, Ness
    [J]. INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 151, 2022, 151