A reinforcement learning algorithm with fuzzy approximation for semi Markov decision problems

Cited by: 3

Authors:

Kula, Ufuk [1]

Ocaktan, Beyazit [2]

Affiliations:

[1] Sakarya Univ, Dept Ind Engn, TR-54187 Sakarya, Turkey

[2] Balikesir Univ, Dept Ind Engn, Balikesir, Turkey

Source:

JOURNAL OF INTELLIGENT & FUZZY SYSTEMS | 2015 / Volume 28 / Issue 04

Keywords:

Fuzzy approximation; ANFIS; reinforcement learning; SMDPs

DOI:

10.3233/IFS-141460

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Real-life stochastic problems are generally large-scale and difficult to model, and therefore suffer from the curse of dimensionality. Such problems cannot be solved by classical optimization methods. This paper presents a reinforcement learning algorithm that uses a fuzzy inference system, ANFIS, to find an approximate solution for semi Markov decision problems (SMDPs). The performance of the developed algorithm is measured and compared with that of a classical reinforcement learning algorithm, SMART, in a numerical example. Our numerical examples show that the developed algorithm converges significantly faster as the problem size increases, and that the average cost calculated by the algorithm gets closer to that of SMART as the number of epochs used in the developed algorithm is increased.

Pages: 1733-1744

Number of pages: 12

50 entries in total

[1] An Inverse Reinforcement Learning Algorithm for semi-Markov Decision Processes
Tan, Chuanfang
Li, Yanjie
Cheng, Yuhu
2017 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (SSCI), 2017, : 1256 - 1261
[2] Solving semi-Markov decision problems using average reward reinforcement learning
Dept. Indust. and Mgmt. Syst. Eng., University of South Florida, Tampa, FL 33620, United States
Unknown
Unknown
Manage Sci, 4 (560-574):
[3] Solving semi-Markov decision problems using average reward reinforcement learning
Das, TK
Gosavi, A
Mahadevan, S
Marchalleck, N
MANAGEMENT SCIENCE, 1999, 45 (04) : 560 - 574
[4] A reinforcement learning based algorithm for Markov decision processes
Bhatnagar, S
Kumar, S
2005 International Conference on Intelligent Sensing and Information Processing, Proceedings, 2005, : 199 - 204
[5] Reinforcement learning algorithm for partially observable Markov decision processes
Wang, Xue-Ning
He, Han-Gen
Xu, Xin
Kongzhi yu Juece/Control and Decision, 2004, 19 (11) : 1263 - 1266
[6] Average Reward Reinforcement Learning for Semi-Markov Decision Processes
Yang, Jiayuan
Li, Yanjie
Chen, Haoyao
Li, Jiangang
NEURAL INFORMATION PROCESSING, ICONIP 2017, PT I, 2017, 10634 : 768 - 777
[7] A policy gradient reinforcement learning algorithm with fuzzy function approximation
Gu, DB
Yang, EF
IEEE ROBIO 2004: Proceedings of the IEEE International Conference on Robotics and Biomimetics, 2004, : 936 - 940
[8] A reinforcement learning based algorithm for finite horizon Markov decision processes
Bhatnagar, Shalabh
Abdulla, Mohammed Shahid
PROCEEDINGS OF THE 45TH IEEE CONFERENCE ON DECISION AND CONTROL, VOLS 1-14, 2006, : 5519 - 5524
[9] RVI Reinforcement Learning for Semi-Markov Decision Processes with Average Reward
Li, Yanjie
Cao, Fang
2010 8TH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION (WCICA), 2010, : 1674 - 1679
[10] Fuzzy Reinforcement Learning Control for Decentralized Partially Observable Markov Decision Processes
Sharma, Rajneesh
Spaan, Matthijs T. J.
IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ 2011), 2011, : 1422 - 1429
