Fuzzy Reinforcement Learning Control for Decentralized Partially Observable Markov Decision Processes

Cited by: 0

Authors
Sharma, Rajneesh [1 ]
Spaan, Matthijs T. J. [2 ]
Institutions
[1] Netaji Subhas Inst Technol, Instrumentat & Control Div, New Delhi, India
[2] Inst Super Tecn, Inst Syst & Robot, Lisbon, Portugal
Keywords
Reinforcement learning; Fuzzy systems; Cooperative multiagent systems; Decentralized POMDPs;
DOI
Not available
CLC Classification Code
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs) offer a powerful framework for optimizing sequential decision making in partially observable stochastic environments. However, finding optimal solutions for Dec-POMDPs is known to be intractable, necessitating approximate or suboptimal approaches. To address this problem, this work proposes a novel fuzzy reinforcement learning (RL) based game-theoretic controller for Dec-POMDPs. The proposed controller implements fuzzy RL on Dec-POMDPs, which are modeled as a sequence of Bayesian games (BGs). The main contributions of the work are the introduction of a game-based RL paradigm in a Dec-POMDP setting, and the use of fuzzy inference systems to effectively generalize the underlying belief space. We apply the proposed technique to two benchmark problems and compare the results against a state-of-the-art Dec-POMDP control approach. The results validate the feasibility and effectiveness of using game-theoretic RL based fuzzy control to address the intractability of Dec-POMDPs, thus opening up a new research direction.
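The abstract describes using fuzzy inference to generalize over the continuous belief space. A minimal sketch of that general idea, assuming a standard fuzzy Q-learning scheme (not the authors' exact algorithm): each fuzzy rule is anchored at a prototype belief point, Gaussian membership functions act as firing strengths, the agent's Q-values at any belief are a membership-weighted blend of per-rule Q-vectors, and the TD update credits each rule in proportion to its firing strength. All class and parameter names here are illustrative.

```python
import numpy as np

class FuzzyBeliefQ:
    """Illustrative fuzzy Q-learning over a belief simplex.

    Each rule is anchored at a prototype belief; Gaussian memberships
    generalize Q-values across the continuous belief space.
    """

    def __init__(self, prototypes, n_actions, alpha=0.1, gamma=0.95, sigma=0.2):
        self.prototypes = np.asarray(prototypes, dtype=float)  # rule anchors
        self.q = np.zeros((len(self.prototypes), n_actions))   # per-rule Q-vectors
        self.alpha, self.gamma, self.sigma = alpha, gamma, sigma

    def memberships(self, belief):
        # Gaussian firing strength of each rule, normalized to sum to 1.
        d2 = np.sum((self.prototypes - belief) ** 2, axis=1)
        w = np.exp(-d2 / (2 * self.sigma ** 2))
        return w / w.sum()

    def q_values(self, belief):
        # Fuzzy blend: membership-weighted combination of rule Q-vectors.
        return self.memberships(belief) @ self.q

    def update(self, belief, action, reward, next_belief):
        # TD update; each rule is credited by its firing strength.
        w = self.memberships(belief)
        target = reward + self.gamma * self.q_values(next_belief).max()
        td_error = target - self.q_values(belief)[action]
        self.q[:, action] += self.alpha * w * td_error


# Usage on a toy 2-state belief space (beliefs are points on the 1-simplex):
ctrl = FuzzyBeliefQ(prototypes=[[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]], n_actions=2)
b, b_next = np.array([0.7, 0.3]), np.array([0.4, 0.6])
ctrl.update(b, action=0, reward=1.0, next_belief=b_next)
```

In a Dec-POMDP the full joint belief is not directly available to each agent, so any practical variant would operate on individual or approximate beliefs; the sketch above only shows the belief-space generalization mechanism the abstract attributes to fuzzy inference.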
Pages: 1422 - 1429
Number of pages: 8