Adaptive Fuzzy Watkins: A New Adaptive Approach for Eligibility Traces in Reinforcement Learning

Cited by: 1
Authors
Shokri, Matin [1]
Khasteh, Seyed Hossein [2]
Aminifar, Amin [1]
Affiliations
[1] KN Toosi Univ Technol, Tehran N, Iran
[2] KN Toosi Univ Technol, Comp Engn Dept, Tehran, Iran
Keywords
Reinforcement learning; Temporal difference; Watkins's Q; Fuzzy inference; Adaptive fuzzy Watkins (AFW)
DOI
10.1007/s40815-019-00633-x
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Reinforcement learning is among the most reliable learning methods and has been applied to many problems. Temporal difference methods are among the best families of reinforcement learning methods. The most important weakness of reinforcement learning methods, including temporal difference methods, is their slow convergence. Many studies have addressed this problem, and eligibility traces are one of the proposed solutions. Owing to the nature of off-policy methods, combining eligibility traces with them requires special attention. In Watkins's method, one of the dominant eligibility-trace methods, the traces are cut during exploratory actions; early in learning, this diminishes the benefit of eligibility traces. In this study, we propose a framework for combining eligibility traces with off-policy methods. The aim is to make proper use of the information gathered while the agent explores; to this end, fuzzy adaptation decides whether eligibility traces are applied during the agent's exploratory actions. We apply the method to finding the goal state in static and dynamic grid worlds. We compare our approach against state-of-the-art techniques and show that it outperforms them in both average achieved reward and convergence time.
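The trace cutting that the abstract attributes to Watkins's method can be illustrated with a single tabular update step. The following is a minimal sketch of the textbook Watkins's Q(λ) rule with accumulating traces, not the AFW method itself, whose fuzzy rules are not given in this record. The function name, the array layout, and the default values of `alpha`, `gamma`, and `lam` are assumptions made for illustration.

```python
import numpy as np

def watkins_q_lambda_step(Q, E, s, a, r, s_next, a_next,
                          alpha=0.1, gamma=0.9, lam=0.8):
    """One tabular Watkins's Q(lambda) update (hypothetical helper).

    Q, E   : float arrays of shape (n_states, n_actions), updated in place.
    a_next : action the behaviour policy will actually take in s_next.
    Returns True if a_next was greedy (traces decayed), False if it was
    exploratory (traces cut to zero).
    """
    # Greedy action in the next state under the current estimates.
    a_star = int(np.argmax(Q[s_next]))
    # a_next counts as greedy if it ties with the best action value.
    greedy = bool(Q[s_next, a_next] == Q[s_next, a_star])
    if greedy:
        a_star = a_next

    # Off-policy TD error: bootstrap from the greedy action value.
    delta = r + gamma * Q[s_next, a_star] - Q[s, a]

    # Accumulating trace for the visited pair, then credit all traced pairs.
    E[s, a] += 1.0
    Q += alpha * delta * E

    if greedy:
        E *= gamma * lam   # keep propagating credit along the greedy path
    else:
        E[:] = 0.0         # exploratory action: Watkins's method cuts the traces
    return greedy
```

The `else` branch is the step the abstract identifies as costly early in learning: every exploratory action discards the accumulated traces. An adaptive scheme such as the one the paper proposes would replace that unconditional reset with a decision rule; the rule itself is not specified here.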
Pages: 1443-1454
Page count: 12
Related papers
50 in total
  • [1] Adaptive Fuzzy Watkins: A New Adaptive Approach for Eligibility Traces in Reinforcement Learning
    Matin Shokri
    Seyed Hossein Khasteh
    Amin Aminifar
    International Journal of Fuzzy Systems, 2019, 21 : 1443 - 1454
  • [2] Adaptive Eligibility Traces for Online Deep Reinforcement Learning
    Kobayashi, Taisuke
    INTELLIGENT AUTONOMOUS SYSTEMS 16, IAS-16, 2022, 412 : 417 - 428
  • [3] Learning with eligibility traces in adaptive critic designs
    Xu, Jing
    Liang, Fu-Ming
    Yu, Wen-Sheng
    PROCEEDINGS OF THE 2006 IEEE INTERNATIONAL CONFERENCE ON VEHICULAR ELECTRONICS AND SAFETY, 2006, : 309 - +
  • [4] Adaptive and multiple time-scale eligibility traces for online deep reinforcement learning
    Kobayashi, Taisuke
    ROBOTICS AND AUTONOMOUS SYSTEMS, 2022, 151
  • [5] Adaptive Group-based Signal Control Using Reinforcement Learning with Eligibility Traces
    Jin, Junchen
    Ma, Xiaoliang
    2015 IEEE 18TH INTERNATIONAL CONFERENCE ON INTELLIGENT TRANSPORTATION SYSTEMS, 2015, : 2412 - 2417
  • [6] Multi-Agent Reinforcement Learning for Adaptive Routing: A Hybrid Method using Eligibility Traces
    Zeng, Siliang
    Xu, Xingfei
    Chen, Yi
    2020 IEEE 16TH INTERNATIONAL CONFERENCE ON CONTROL & AUTOMATION (ICCA), 2020, : 1332 - 1339
  • [7] Reinforcement learning with replacing eligibility traces
    Singh, SP
    Sutton, RS
    MACHINE LEARNING, 1996, 22 (1-3) : 123 - 158
  • [8] Reinforcement learning based fuzzy adaptive controller
    Ma, Li
    Cai, Zixing
    Zhongnan Gongye Daxue Xuebao/Journal of Central South University of Technology, 29 (02) : 172 - 175
  • [9] Adaptive fuzzy command acquisition with reinforcement learning
    Lin, CT
    Kan, MC
    IEEE TRANSACTIONS ON FUZZY SYSTEMS, 1998, 6 (01) : 102 - 121
  • [10] A reinforcement learning adaptive fuzzy controller for robots
    Lin, CK
    FUZZY SETS AND SYSTEMS, 2003, 137 (03) : 339 - 352