A reinforcement learning algorithm with fuzzy approximation for semi Markov decision problems

Cited by: 3
Authors
Kula, Ufuk [1]
Ocaktan, Beyazit [2]
Affiliations
[1] Sakarya Univ, Dept Ind Engn, TR-54187 Sakarya, Turkey
[2] Balikesir Univ, Dept Ind Engn, Balikesir, Turkey
Keywords
Fuzzy approximation; ANFIS; reinforcement learning; SMDPs
DOI
10.3233/IFS-141460
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Real-life stochastic problems are generally large-scale and difficult to model, and therefore suffer from the curse of dimensionality. Such problems cannot be solved by classical optimization methods. This paper presents a reinforcement learning algorithm that uses a fuzzy inference system, ANFIS, to find an approximate solution for semi-Markov decision problems (SMDPs). The performance of the developed algorithm is measured and compared to that of a classical reinforcement learning algorithm, SMART, in a numerical example. The numerical examples show that the developed algorithm converges significantly faster as the problem size increases, and that the average cost calculated by the algorithm gets closer to that of SMART as the number of epochs used in the developed algorithm is increased.
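For orientation, the baseline named in the abstract, SMART, is usually described as a tabular average-reward Q-learning scheme for SMDPs. The sketch below illustrates that kind of update in its reward form, as commonly stated in the literature; it is not the ANFIS-based algorithm of this paper, which the abstract does not specify. The function names, the dictionary layout of `q`, and the parameters `alpha` and `rho` are assumptions made for illustration. A cost-minimizing variant would use `min` in place of `max`.

```python
def smart_update(q, state, action, reward, sojourn, next_state, rho, alpha):
    """One tabular SMART-style update (illustrative sketch, reward form).

    q          : dict mapping state -> list of action values (assumed layout)
    reward     : immediate reward observed on the transition
    sojourn    : time spent in `state` before jumping to `next_state`
    rho        : current estimate of the average reward per unit time
    alpha      : learning rate in (0, 1]
    """
    # Reward is offset by the average reward earned over the sojourn time.
    target = reward - rho * sojourn + max(q[next_state])
    q[state][action] = (1 - alpha) * q[state][action] + alpha * target
    return q[state][action]


def update_average_reward(total_reward, total_time, reward, sojourn):
    """Refresh the average-reward estimate as cumulative reward / cumulative time.

    In SMART-style schemes this is typically done only after greedy
    (non-exploratory) actions; that check is left to the caller here.
    """
    total_reward += reward
    total_time += sojourn
    return total_reward, total_time, total_reward / total_time
```

In a function-approximation variant such as the one the abstract describes, the table `q` would be replaced by a learned approximator; how the paper does this with ANFIS is not stated in this record.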
Pages: 1733-1744
Page count: 12
Related papers
50 in total
  • [1] An Inverse Reinforcement Learning Algorithm for semi-Markov Decision Processes
    Tan, Chuanfang
    Li, Yanjie
    Cheng, Yuhu
    2017 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (SSCI), 2017, : 1256 - 1261
  • [2] Solving semi-Markov decision problems using average reward reinforcement learning
    Dept. Indust. and Mgmt. Syst. Eng., University of South Florida, Tampa, FL 33620, United States
    Unknown
    Unknown
    Manage Sci, 4 (560-574):
  • [3] Solving semi-Markov decision problems using average reward reinforcement learning
    Das, TK
    Gosavi, A
    Mahadevan, S
    Marchalleck, N
    MANAGEMENT SCIENCE, 1999, 45 (04) : 560 - 574
  • [4] A reinforcement learning based algorithm for Markov decision processes
    Bhatnagar, S
    Kumar, S
    2005 International Conference on Intelligent Sensing and Information Processing, Proceedings, 2005, : 199 - 204
  • [5] Reinforcement learning algorithm for partially observable Markov decision processes
    Wang, Xue-Ning
    He, Han-Gen
    Xu, Xin
    Kongzhi yu Juece/Control and Decision, 2004, 19 (11) : 1263 - 1266
  • [6] Average Reward Reinforcement Learning for Semi-Markov Decision Processes
    Yang, Jiayuan
    Li, Yanjie
    Chen, Haoyao
    Li, Jiangang
    NEURAL INFORMATION PROCESSING, ICONIP 2017, PT I, 2017, 10634 : 768 - 777
  • [7] A policy gradient reinforcement learning algorithm with fuzzy function approximation
    Gu, DB
    Yang, EF
    IEEE ROBIO 2004: Proceedings of the IEEE International Conference on Robotics and Biomimetics, 2004, : 936 - 940
  • [8] A reinforcement learning based algorithm for finite horizon Markov decision processes
    Bhatnagar, Shalabh
    Abdulla, Mohammed Shahid
    PROCEEDINGS OF THE 45TH IEEE CONFERENCE ON DECISION AND CONTROL, VOLS 1-14, 2006, : 5519 - 5524
  • [9] RVI Reinforcement Learning for Semi-Markov Decision Processes with Average Reward
    Li, Yanjie
    Cao, Fang
    2010 8TH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION (WCICA), 2010, : 1674 - 1679
  • [10] Fuzzy Reinforcement Learning Control for Decentralized Partially Observable Markov Decision Processes
    Sharma, Rajneesh
    Spaan, Matthijs T. J.
    IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ 2011), 2011, : 1422 - 1429