Model-Based Reinforcement Learning in Multiagent Systems with Sequential Action Selection

Cited by: 0

Authors:

Akramizadeh, Ali [1]

Afshar, Ahmad [1]

Menhaj, Mohammad Bagher [1]

Jafari, Samira [1]

Affiliations:

[1] Amirkabir Univ Technol, Computat Intelligence & Large Scale Syst Lab, Dept EE, Tehran, Iran

Source:

IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS | 2011 / Vol. E94D / No. 02

Keywords:

multiagent systems; Markov games; model-based reinforcement learning; extensive form game;

DOI:

10.1587/transinf.E94.D.255

Chinese Library Classification number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Model-based reinforcement learning uses the information gathered during each experience more efficiently than model-free reinforcement learning. This is of particular interest in multiagent systems, where a large number of experiences is needed to achieve good performance. In this paper, model-based reinforcement learning is developed, based on traditional prioritized sweeping, for a group of self-interested agents with sequential action selection. Each decision-making situation in this learning process, called an extensive Markov game, is modeled as an n-person general-sum extensive form game with perfect information. A modified version of backward induction is proposed for action selection; it adjusts the tradeoff between selecting subgame perfect equilibrium points as the optimal joint actions and learning new joint actions. The algorithm is proved to be convergent and is discussed in light of new results on the convergence of traditional prioritized sweeping.

Citation

Pages: 255-263

Number of pages: 9

