Multi-Agent Reinforcement Learning Algorithm with Variable Optimistic-Pessimistic Criterion

Cited by: 1
Authors
Akchurina, Natalia [1 ]
Affiliation
[1] Univ Gesamthsch Paderborn, Int Grad Sch Dynam Intelligent Syst, D-4790 Paderborn, Germany
Source
ECAI 2008, PROCEEDINGS | 2008, Vol. 178
DOI
10.3233/978-1-58603-891-5-433
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A reinforcement learning algorithm for multi-agent systems based on a variable Hurwicz optimistic-pessimistic criterion is proposed, and a formal proof of its convergence is given. Hurwicz's criterion makes it possible to embed prior knowledge of how friendly the environment in which the agent is supposed to function will be. Thorough testing of the developed algorithm against well-known reinforcement learning algorithms has shown that in many cases its successful performance can be explained by its tendency to force the other agents to follow the policy that is more profitable for it. In addition, the variability of Hurwicz's criterion allows it to converge to a best response against opponents with stationary policies.
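The Hurwicz criterion blends the best-case and worst-case outcomes of each own action with an optimism coefficient α: fully optimistic play (α = 1) assumes a friendly opponent, fully pessimistic play (α = 0) assumes an adversarial one. A minimal tabular sketch of this idea in a two-player setting is shown below; it is an illustration of the criterion, not the paper's exact algorithm, and the function names (`hurwicz_value`, `q_update`) and hyperparameters are this sketch's own assumptions.

```python
import numpy as np

def hurwicz_value(Q_s, alpha):
    """Hurwicz value of a state.

    Q_s[a, o] holds the estimated payoff of own action a against
    opponent action o. For each own action we blend the optimistic
    (max over o) and pessimistic (min over o) outcomes with weight
    alpha, then pick the own action maximizing the blend.
    Returns (value, best_own_action).
    """
    blended = alpha * Q_s.max(axis=1) + (1 - alpha) * Q_s.min(axis=1)
    return blended.max(), int(blended.argmax())

def q_update(Q, s, a, o, r, s_next, alpha_hurwicz, lr=0.1, gamma=0.9):
    """One tabular update toward r + gamma * HurwiczValue(s_next).

    Q is a dict mapping each state to its payoff matrix Q[s][a, o].
    """
    v_next, _ = hurwicz_value(Q[s_next], alpha_hurwicz)
    Q[s][a, o] += lr * (r + gamma * v_next - Q[s][a, o])
```

The "variable" aspect of the paper's criterion could correspond to adjusting `alpha_hurwicz` over time as the agent learns how friendly the other agents actually are.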
Pages: 433 / +
Page count: 2
Related papers
50 results in total
  • [1] Optimistic-Pessimistic Q-Learning Algorithm for Multi-Agent Systems
    Akchurina, Natalia
    MULTIAGENT SYSTEM TECHNOLOGIES, PROCEEDINGS, 2008, 5244 : 13 - 24
  • [2] Optimistic sequential multi-agent reinforcement learning with motivational communication
    Huang, Anqi
    Wang, Yongli
    Zhou, Xiaoliang
    Zou, Haochen
    Dong, Xu
    Che, Xun
    NEURAL NETWORKS, 2024, 179
  • [3] Optimistic Value Instructors for Cooperative Multi-Agent Reinforcement Learning
    Li, Chao
    Zhang, Yupeng
    Wang, Jianqi
    Hu, Yujing
    Dong, Shaokang
    Li, Wenbin
    Lv, Tangjie
    Fan, Changjie
    Gao, Yang
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 16, 2024, : 17453 - 17460
  • [4] Cross-Observability Optimistic-Pessimistic Safe Reinforcement Learning for Interactive Motion Planning With Visual Occlusion
    Hou, Xiaohui
    Gan, Minggang
    Wu, Wei
    Ji, Yuan
    Zhao, Shiyue
    Chen, Jie
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2024, 25 (11) : 17602 - 17613
  • [5] Conditionally Optimistic Exploration for Cooperative Deep Multi-Agent Reinforcement Learning
    Zhao, Xutong
    Pan, Yangchen
    Xiao, Chenjun
    Chandar, Sarath
    Rajendran, Janarthanan
    UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, 2023, 216 : 2529 - 2540
  • [6] Cautiously-Optimistic Knowledge Sharing for Cooperative Multi-Agent Reinforcement Learning
    Ba, Yanwen
    Liu, Xuan
    Chen, Xinning
    Wang, Hao
    Xu, Yang
    Li, Kenli
    Zhang, Shigeng
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 16, 2024, : 17299 - 17307
  • [7] LMRL: A multi-agent reinforcement learning model and algorithm
    Wang, BN
    Gao, Y
    Chen, ZQ
    Xie, JY
    Chen, SF
    Third International Conference on Information Technology and Applications, Vol 1, Proceedings, 2005, : 303 - 307
  • [8] Sequence to Sequence Multi-agent Reinforcement Learning Algorithm
    Shi T.
    Wang L.
    Huang Z.
    Moshi Shibie yu Rengong Zhineng/Pattern Recognition and Artificial Intelligence, 2021, 34 (03): : 206 - 213
  • [9] A new accelerating algorithm for multi-agent reinforcement learning
    Zhang, Rubo
    Zhong, Yu
    Gu, Guochang
    Journal of Harbin Institute of Technology, 2005, (01) : 48 - 51
  • [10] Multi-Agent Reinforcement Learning
    Stankovic, Milos
    2016 13TH SYMPOSIUM ON NEURAL NETWORKS AND APPLICATIONS (NEUREL), 2016, : 43 - 43