Fuzzy Reinforcement Learning Control for Decentralized Partially Observable Markov Decision Processes

Cited by: 0
Authors
Sharma, Rajneesh [1 ]
Spaan, Matthijs T. J. [2 ]
Affiliations
[1] Netaji Subhas Inst Technol, Instrumentat & Control Div, New Delhi, India
[2] Inst Super Tecn, Inst Syst & Robot, Lisbon, Portugal
Keywords
Reinforcement learning; Fuzzy systems; Cooperative multiagent systems; Decentralized POMDPs;
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs) offer a powerful framework for optimizing sequential decision making in partially observable stochastic environments. However, finding optimal solutions for Dec-POMDPs is known to be intractable, necessitating approximate/suboptimal approaches. To address this problem, this work proposes a novel fuzzy reinforcement learning (RL) based game-theoretic controller for Dec-POMDPs. The proposed controller implements fuzzy RL on Dec-POMDPs modeled as a sequence of Bayesian games (BGs). The main contributions of this work are the introduction of a game-based RL paradigm in a Dec-POMDP setting and the use of fuzzy inference systems to effectively generalize the underlying belief space. We apply the proposed technique to two benchmark problems and compare the results against a state-of-the-art Dec-POMDP control approach. The results validate the feasibility and effectiveness of game-theoretic RL-based fuzzy control for addressing the intractability of Dec-POMDPs, thus opening up a new research direction.
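To make the abstract's approach concrete, below is a minimal Python sketch of fuzzy Q-learning over a belief space. It is an illustrative assumption, not the authors' implementation: it assumes a toy two-state problem whose joint belief reduces to a single scalar, a small joint-action set, and triangular membership functions; all names (memberships, q_value, update) and constants are hypothetical. The Bayesian-game stage that coordinates the agents in the paper is omitted for brevity.

    import random

    N_RULES = 5                   # fuzzy rules spread over the belief interval [0, 1]
    JOINT_ACTIONS = [(a1, a2) for a1 in range(2) for a2 in range(2)]  # 2 agents x 2 actions
    CENTERS = [i / (N_RULES - 1) for i in range(N_RULES)]
    WIDTH = 1.0 / (N_RULES - 1)

    # Q[i][a]: Q-value of joint action a under fuzzy rule i
    Q = [[0.0] * len(JOINT_ACTIONS) for _ in range(N_RULES)]

    def memberships(belief):
        # Triangular membership of the belief in each rule, normalized to sum to 1.
        mu = [max(0.0, 1.0 - abs(belief - c) / WIDTH) for c in CENTERS]
        total = sum(mu)
        return [m / total for m in mu]

    def q_value(belief, a):
        # Fuzzy inference: membership-weighted sum of the rules' Q-values.
        return sum(m * Q[i][a] for i, m in enumerate(memberships(belief)))

    def choose_action(belief, epsilon=0.1):
        # Epsilon-greedy over the fuzzy-inferred Q-values.
        if random.random() < epsilon:
            return random.randrange(len(JOINT_ACTIONS))
        return max(range(len(JOINT_ACTIONS)), key=lambda a: q_value(belief, a))

    def update(belief, a, reward, next_belief, alpha=0.1, gamma=0.95):
        # Standard fuzzy Q-learning: apportion the TD error by rule membership.
        best_next = max(q_value(next_belief, b) for b in range(len(JOINT_ACTIONS)))
        td = reward + gamma * best_next - q_value(belief, a)
        for i, m in enumerate(memberships(belief)):
            Q[i][a] += alpha * m * td

    # One illustrative step: act at belief 0.3, observe reward 1.0, belief moves to 0.7.
    a = choose_action(0.3)
    update(0.3, a, 1.0, 0.7)

The membership-weighted update is the standard fuzzy Q-learning rule; in the paper, the controller additionally models each decision stage as a Bayesian game over the agents' joint beliefs to select coordinated joint actions, which this sketch does not reproduce.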
Pages: 1422-1429
Page count: 8
Related Papers
50 items in total
  • [21] Policy Reuse for Learning and Planning in Partially Observable Markov Decision Processes
    Wu, Bo
    Feng, Yanpeng
    2017 4TH INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE AND CONTROL ENGINEERING (ICISCE), 2017, : 549 - 552
  • [22] PARTIALLY OBSERVABLE MARKOV DECISION PROCESSES WITH PARTIALLY OBSERVABLE RANDOM DISCOUNT FACTORS
    Martinez-Garcia, E. Everardo
    Minjarez-Sosa, J. Adolfo
    Vega-Amaya, Oscar
    KYBERNETIKA, 2022, 58 (06) : 960 - 983
  • [23] Learning to control listening-oriented dialogue using partially observable markov decision processes
    Meguro, Toyomi
    Minami, Yasuhiro
    Higashinaka, Ryuichiro
    Dohsaka, Kohji
ACM TRANSACTIONS ON SPEECH AND LANGUAGE PROCESSING, 2013, 10 (4)
  • [24] Guided Soft Actor Critic: A Guided Deep Reinforcement Learning Approach for Partially Observable Markov Decision Processes
    Haklidir, Mehmet
    Temeltas, Hakan
    IEEE ACCESS, 2021, 9 : 159672 - 159683
  • [25] Optimizing Spatial and Temporal Reuse in Wireless Networks by Decentralized Partially Observable Markov Decision Processes
    Pajarinen, Joni
    Hottinen, Ari
    Peltonen, Jaakko
    IEEE TRANSACTIONS ON MOBILE COMPUTING, 2014, 13 (04) : 866 - 879
  • [26] Structural Estimation of Partially Observable Markov Decision Processes
    Chang, Yanling
    Garcia, Alfredo
    Wang, Zhide
    Sun, Lu
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2023, 68 (08) : 5135 - 5141
  • [27] Nonapproximability results for partially observable Markov decision processes
    Lusena, C
    Goldsmith, J
    Mundhenk, M
    JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2001, 14 : 83 - 113
  • [28] Entropy Maximization for Partially Observable Markov Decision Processes
    Savas, Yagiz
    Hibbard, Michael
    Wu, Bo
    Tanaka, Takashi
    Topcu, Ufuk
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2022, 67 (12) : 6948 - 6955
  • [29] On Anderson Acceleration for Partially Observable Markov Decision Processes
    Ermis, Melike
    Park, Mingyu
    Yang, Insoon
    2021 60TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2021, : 4478 - 4485
  • [30] Transition Entropy in Partially Observable Markov Decision Processes
    Melo, Francisco S.
    Ribeiro, Isabel
INTELLIGENT AUTONOMOUS SYSTEMS 9, 2006: 282+