Fuzzy Reinforcement Learning Control for Decentralized Partially Observable Markov Decision Processes

Cited by: 0
Authors
Sharma, Rajneesh [1 ]
Spaan, Matthijs T. J. [2 ]
Affiliations
[1] Netaji Subhas Inst Technol, Instrumentat & Control Div, New Delhi, India
[2] Inst Super Tecn, Inst Syst & Robot, Lisbon, Portugal
Keywords
Reinforcement learning; Fuzzy systems; Cooperative multiagent systems; Decentralized POMDPs;
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs) offer a powerful framework for optimizing sequential decision making in partially observable stochastic environments. However, finding optimal solutions for Dec-POMDPs is known to be intractable, necessitating approximate/suboptimal approaches. To address this problem, this work proposes a novel fuzzy reinforcement learning (RL) based game-theoretic controller for Dec-POMDPs. The proposed controller implements fuzzy RL on Dec-POMDPs modeled as a sequence of Bayesian games (BGs). The main contributions of this work are the introduction of a game-based RL paradigm in a Dec-POMDP setting and the use of fuzzy inference systems to effectively generalize the underlying belief space. We apply the proposed technique to two benchmark problems and compare the results against a state-of-the-art Dec-POMDP control approach. The results validate the feasibility and effectiveness of game-theoretic RL-based fuzzy control for addressing the intractability of Dec-POMDPs, thus opening up a new research direction.
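To make the abstract's approach concrete, below is a minimal Python sketch of fuzzy Q-learning over a belief space. It is an illustrative assumption, not the authors' implementation: it assumes a toy two-state problem whose joint belief reduces to a single scalar, a small joint-action set, and triangular membership functions; all names (memberships, q_value, update) and constants are hypothetical. The Bayesian-game stage that coordinates the agents in the paper is omitted for brevity.

    import random

    N_RULES = 5                   # fuzzy rules spread over the belief interval [0, 1]
    JOINT_ACTIONS = [(a1, a2) for a1 in range(2) for a2 in range(2)]  # 2 agents x 2 actions
    CENTERS = [i / (N_RULES - 1) for i in range(N_RULES)]
    WIDTH = 1.0 / (N_RULES - 1)

    # Q[i][a]: Q-value of joint action a under fuzzy rule i
    Q = [[0.0] * len(JOINT_ACTIONS) for _ in range(N_RULES)]

    def memberships(belief):
        # Triangular membership of the belief in each rule, normalized to sum to 1.
        mu = [max(0.0, 1.0 - abs(belief - c) / WIDTH) for c in CENTERS]
        total = sum(mu)
        return [m / total for m in mu]

    def q_value(belief, a):
        # Fuzzy inference: membership-weighted sum of the rules' Q-values.
        return sum(m * Q[i][a] for i, m in enumerate(memberships(belief)))

    def choose_action(belief, epsilon=0.1):
        # Epsilon-greedy over the fuzzy-inferred Q-values.
        if random.random() < epsilon:
            return random.randrange(len(JOINT_ACTIONS))
        return max(range(len(JOINT_ACTIONS)), key=lambda a: q_value(belief, a))

    def update(belief, a, reward, next_belief, alpha=0.1, gamma=0.95):
        # Standard fuzzy Q-learning: apportion the TD error by rule membership.
        best_next = max(q_value(next_belief, b) for b in range(len(JOINT_ACTIONS)))
        td = reward + gamma * best_next - q_value(belief, a)
        for i, m in enumerate(memberships(belief)):
            Q[i][a] += alpha * m * td

    # One illustrative step: act at belief 0.3, observe reward 1.0, belief moves to 0.7.
    a = choose_action(0.3)
    update(0.3, a, 1.0, 0.7)

The membership-weighted update is the standard fuzzy Q-learning rule; in the paper, the controller additionally models each decision stage as a Bayesian game over the agents' joint beliefs to select coordinated joint actions, which this sketch does not reproduce.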
Pages: 1422-1429
Page count: 8
Related Papers
50 items in total
  • [21] Policy Reuse for Learning and Planning in Partially Observable Markov Decision Processes
    Wu, Bo
    Feng, Yanpeng
    2017 4TH INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE AND CONTROL ENGINEERING (ICISCE), 2017, : 549 - 552
  • [22] PARTIALLY OBSERVABLE MARKOV DECISION PROCESSES WITH PARTIALLY OBSERVABLE RANDOM DISCOUNT FACTORS
    Martinez-Garcia, E. Everardo
    Minjarez-Sosa, J. Adolfo
    Vega-Amaya, Oscar
    KYBERNETIKA, 2022, 58 (06) : 960 - 983
  • [23] Learning to control listening-oriented dialogue using partially observable markov decision processes
    Meguro, Toyomi
    Minami, Yasuhiro
    Higashinaka, Ryuichiro
    Dohsaka, Kohji
ACM TRANSACTIONS ON SPEECH AND LANGUAGE PROCESSING, 2013, 10 (4)
  • [24] Guided Soft Actor Critic: A Guided Deep Reinforcement Learning Approach for Partially Observable Markov Decision Processes
    Haklidir, Mehmet
    Temeltas, Hakan
    IEEE ACCESS, 2021, 9 : 159672 - 159683
  • [25] Optimizing Spatial and Temporal Reuse in Wireless Networks by Decentralized Partially Observable Markov Decision Processes
    Pajarinen, Joni
    Hottinen, Ari
    Peltonen, Jaakko
    IEEE TRANSACTIONS ON MOBILE COMPUTING, 2014, 13 (04) : 866 - 879
  • [26] Structural Estimation of Partially Observable Markov Decision Processes
    Chang, Yanling
    Garcia, Alfredo
    Wang, Zhide
    Sun, Lu
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2023, 68 (08) : 5135 - 5141
  • [27] Nonapproximability results for partially observable Markov decision processes
    Lusena, C
    Goldsmith, J
    Mundhenk, M
    JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2001, 14 : 83 - 113
  • [28] Entropy Maximization for Partially Observable Markov Decision Processes
    Savas, Yagiz
    Hibbard, Michael
    Wu, Bo
    Tanaka, Takashi
    Topcu, Ufuk
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2022, 67 (12) : 6948 - 6955
  • [29] On Anderson Acceleration for Partially Observable Markov Decision Processes
    Ermis, Melike
    Park, Mingyu
    Yang, Insoon
    2021 60TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2021, : 4478 - 4485
  • [30] Transition Entropy in Partially Observable Markov Decision Processes
    Melo, Francisco S.
    Ribeiro, Isabel
INTELLIGENT AUTONOMOUS SYSTEMS 9, 2006: 282+