Fuzzy Reinforcement Learning Control for Decentralized Partially Observable Markov Decision Processes

Cited: 0
|
Authors
Sharma, Rajneesh [1 ]
Spaan, Matthijs T. J. [2 ]
Affiliations
[1] Netaji Subhas Inst Technol, Instrumentat & Control Div, New Delhi, India
[2] Inst Super Tecn, Inst Syst & Robot, Lisbon, Portugal
Keywords
Reinforcement learning; Fuzzy systems; Cooperative multiagent systems; Decentralized POMDPs;
DOI
None available
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs) offer a powerful platform for optimizing sequential decision making in partially observable stochastic environments. However, finding optimal solutions for Dec-POMDPs is known to be intractable, necessitating approximate/suboptimal approaches. To address this problem, this work proposes a novel fuzzy reinforcement learning (RL) based game-theoretic controller for Dec-POMDPs. The proposed controller implements fuzzy RL on Dec-POMDPs, which are modeled as a sequence of Bayesian games (BGs). The main contributions of the work are the introduction of a game-based RL paradigm in a Dec-POMDP setting, and the use of fuzzy inference systems to effectively generalize the underlying belief space. We apply the proposed technique to two benchmark problems and compare the results against a state-of-the-art Dec-POMDP control approach. The results validate the feasibility and effectiveness of game-theoretic RL based fuzzy control for addressing the intractability of Dec-POMDPs, thus opening up a new research direction.
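One of the abstract's central ideas, using a fuzzy inference system to generalize a value function across a continuous belief space, can be illustrated with a minimal sketch. Everything below is an illustrative assumption, not the authors' algorithm: the `FuzzyBeliefQ` class name, the single-agent 1-D belief parameterization `b = P(state 1)`, the triangular membership functions, and the plain fuzzy Q-learning update. The paper's actual controller operates on a Dec-POMDP modeled as a sequence of Bayesian games, which this sketch does not reproduce.

```python
class FuzzyBeliefQ:
    """Fuzzy Q-function over a 1-D belief parameter b in [0, 1].

    Fuzzy rule centers partition the belief interval; each rule
    stores one Q-value per action. Q(b, a) is the membership-weighted
    combination of rule Q-values, so learning at one belief point
    generalizes to nearby beliefs (illustrative sketch only).
    """

    def __init__(self, n_rules, n_actions, alpha=0.1, gamma=0.95):
        # Evenly spaced triangular membership functions covering [0, 1].
        self.centers = [i / (n_rules - 1) for i in range(n_rules)]
        self.width = 1.0 / (n_rules - 1)
        self.q = [[0.0] * n_actions for _ in range(n_rules)]
        self.n_actions = n_actions
        self.alpha = alpha
        self.gamma = gamma

    def memberships(self, b):
        """Normalized firing strengths of all rules at belief b."""
        mu = [max(0.0, 1.0 - abs(b - c) / self.width) for c in self.centers]
        total = sum(mu)
        return [m / total for m in mu]

    def value(self, b, a):
        """Q(b, a): membership-weighted sum of the rules' Q-values."""
        return sum(m * row[a] for m, row in zip(self.memberships(b), self.q))

    def update(self, b, a, reward, b_next):
        """Fuzzy Q-learning step: distribute the TD error over the
        rules that fired, proportionally to their membership."""
        target = reward + self.gamma * max(
            self.value(b_next, a2) for a2 in range(self.n_actions))
        td_error = target - self.value(b, a)
        for i, m in enumerate(self.memberships(b)):
            self.q[i][a] += self.alpha * td_error * m
```

Because each update is smeared over neighboring rules, a reward observed at belief 0.5 also raises the estimated value at belief 0.4, which is the belief-space generalization the abstract attributes to the fuzzy inference system.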
Pages: 1422-1429
Page count: 8