Adaptive Fuzzy Watkins: A New Adaptive Approach for Eligibility Traces in Reinforcement Learning

Cited by: 1
Authors
Shokri, Matin [1]
Khasteh, Seyed Hossein [2]
Aminifar, Amin [1]
Affiliations
[1] KN Toosi Univ Technol, Tehran N, Iran
[2] KN Toosi Univ Technol, Comp Engn Dept, Tehran, Iran
Keywords
Reinforcement learning; Temporal difference; Watkins's Q; Fuzzy inference; Adaptive fuzzy Watkins (AFW)
DOI
10.1007/s40815-019-00633-x
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Reinforcement learning is a reliable approach that has been used to solve many problems, and temporal difference methods are among the best-performing families of reinforcement learning methods. The most important weakness of reinforcement learning methods, including temporal difference methods, is their slow convergence rate. Many studies have been devoted to this problem, and eligibility traces are one of the proposed solutions. Owing to the nature of off-policy methods, combining eligibility traces with them requires special attention. In the early stages of learning with Watkins's method (one of the dominant eligibility trace methods), cutting the eligibility traces whenever an exploratory action is taken diminishes the benefit of the traces. In this study, we propose a framework for combining eligibility traces with off-policy methods. The approach aims to make proper use of the information gathered during the agent's exploratory actions; to this end, the decision whether to apply the eligibility traces during exploratory actions is made by means of fuzzy adaptation. We apply the method to finding the goal state in static and dynamic grid worlds. We compare our approach against state-of-the-art techniques and show that it outperforms them in terms of both average achieved reward and convergence time.
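The trace-cutting behaviour that the abstract attributes to Watkins's method can be illustrated with a minimal sketch of standard tabular Watkins's Q(lambda). This is not the authors' AFW method, and it contains no fuzzy adaptation. The grid size, rewards, hyperparameters, step cap, and the function name `watkins_q_lambda` are all assumptions chosen for illustration.

```python
import random

def watkins_q_lambda(n_rows=4, n_cols=4, goal=(3, 3), episodes=200,
                     alpha=0.1, gamma=0.95, lam=0.8, epsilon=0.1, seed=0):
    """Tabular Watkins's Q(lambda) on a hypothetical static grid world.

    Returns the learned Q table and the number of times the eligibility
    traces were cut because the agent took an exploratory action.
    """
    rng = random.Random(seed)
    actions = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
    states = [(r, c) for r in range(n_rows) for c in range(n_cols)]
    q = {(s, a): 0.0 for s in states for a in range(len(actions))}

    def step(state, a):
        # Moves that would leave the grid keep the agent in place.
        r = min(max(state[0] + actions[a][0], 0), n_rows - 1)
        c = min(max(state[1] + actions[a][1], 0), n_cols - 1)
        nxt = (r, c)
        # Assumed reward scheme: +1 at the goal, small step cost otherwise.
        return nxt, (1.0 if nxt == goal else -0.01), nxt == goal

    def greedy(state):
        return max(range(len(actions)), key=lambda a: q[(state, a)])

    def choose(state):
        # Epsilon-greedy behaviour policy.
        if rng.random() < epsilon:
            return rng.randrange(len(actions))
        return greedy(state)

    cuts = 0
    for _ in range(episodes):
        traces = {}
        state, done, steps = (0, 0), False, 0
        action = choose(state)
        while not done and steps < 500:
            nxt, reward, done = step(state, action)
            next_action = choose(nxt)
            best = greedy(nxt)
            # Off-policy target: bootstrap from the greedy action value.
            target = reward if done else reward + gamma * q[(nxt, best)]
            delta = target - q[(state, action)]
            # Accumulating trace for the pair that was just visited.
            traces[(state, action)] = traces.get((state, action), 0.0) + 1.0
            # An action that ties with the maximum counts as greedy.
            is_greedy = q[(nxt, next_action)] == q[(nxt, best)]
            for key in list(traces):
                q[key] += alpha * delta * traces[key]
                traces[key] *= gamma * lam
            if not is_greedy:
                # Watkins's rule: an exploratory action resets all traces.
                traces.clear()
                cuts += 1
            state, action = nxt, next_action
            steps += 1
    return q, cuts
```

Calling `watkins_q_lambda()` returns the Q table together with a count of trace cuts. Each cut discards the credit-assignment history built up so far, which is the loss the abstract describes for early learning, when exploratory actions are frequent.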
Pages: 1443-1454
Number of pages: 12