Satisficing Paths and Independent Multiagent Reinforcement Learning in Stochastic Games

Cited by: 1
Authors
Yongacoglu, Bora [1]
Arslan, Gurdal [2]
Yuksel, Serdar [1]
Affiliations
[1] Queens Univ, Dept Math & Stat, Kingston, ON, Canada
[2] Univ Hawaii Manoa, Dept Elect Engn, Honolulu, HI 96822 USA
Source
SIAM JOURNAL ON MATHEMATICS OF DATA SCIENCE | 2023 / Vol. 5 / No. 3
Keywords
multiagent reinforcement learning; independent learners; learning in games; stochastic games; decentralized systems; fictitious play; uncoupled dynamics; convergence; systems; teams; Go
DOI
10.1137/22M1515112
Chinese Library Classification (CLC) number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
In multiagent reinforcement learning, independent learners are those that do not observe the actions of other agents in the system. Due to the decentralization of information, it is challenging to design independent learners that drive play to equilibrium. This paper investigates the feasibility of using satisficing dynamics to guide independent learners to approximate equilibrium in stochastic games. For $\epsilon \geq 0$, an $\epsilon$-satisficing policy update rule is any rule that instructs the agent to not change its policy when it is $\epsilon$-best-responding to the policies of the remaining players; $\epsilon$-satisficing paths are defined to be sequences of joint policies obtained when each agent uses some $\epsilon$-satisficing policy update rule to select its next policy. We establish structural results on the existence of $\epsilon$-satisficing paths into $\epsilon$-equilibrium in both symmetric N-player games and general stochastic games with two players. We then present an independent learning algorithm for N-player symmetric games and give high probability guarantees of convergence to $\epsilon$-equilibrium under self-play. This guarantee is made using symmetry alone, leveraging the previously unexploited structure of $\epsilon$-satisficing paths.
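The definition of an epsilon-satisficing update rule in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a one-shot two-player coordination game with pure policies in place of a stochastic game, exact payoff evaluation in place of learned estimates, and uniform random search when an agent is not satisfied. The payoff table and all function names (`PAYOFFS`, `is_eps_best_responding`, `satisficing_step`, `satisficing_path`) are hypothetical.

```python
import random

# Hypothetical 2-player coordination game (an assumption for illustration):
# PAYOFFS[i][a0][a1] is player i's reward when player 0 plays a0 and player 1 plays a1.
PAYOFFS = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 0.0], [0.0, 1.0]],
]
ACTIONS = [0, 1]


def payoff(player, joint):
    """Reward of `player` under the joint (pure) policy `joint`."""
    return PAYOFFS[player][joint[0]][joint[1]]


def deviate(joint, player, action):
    """Joint policy in which only `player` switches to `action`."""
    new = list(joint)
    new[player] = action
    return tuple(new)


def is_eps_best_responding(player, joint, eps):
    """True if no unilateral deviation improves `player`'s payoff by more than eps."""
    best = max(payoff(player, deviate(joint, player, a)) for a in ACTIONS)
    return payoff(player, joint) >= best - eps


def satisficing_step(joint, eps, rng):
    """One simultaneous update: satisfied agents keep their policy;
    unsatisfied agents may pick anything (here: uniformly at random)."""
    return tuple(
        joint[i] if is_eps_best_responding(i, joint, eps) else rng.choice(ACTIONS)
        for i in range(len(joint))
    )


def satisficing_path(start, eps, rng, max_steps=100):
    """Follow satisficing updates until an eps-equilibrium or the step limit."""
    path = [start]
    while len(path) <= max_steps:
        joint = path[-1]
        if all(is_eps_best_responding(i, joint, eps) for i in range(len(joint))):
            break  # every agent is satisfied: an eps-equilibrium was reached
        path.append(satisficing_step(joint, eps, rng))
    return path


if __name__ == "__main__":
    rng = random.Random(0)
    print(satisficing_path((0, 1), eps=0.0, rng=rng))
```

The only constraint the rule imposes is that a satisfied agent stays put; the random choice for unsatisfied agents is one of many admissible behaviours. Independent learners would not have direct access to the payoff table used here and would have to estimate whether they are best-responding from their own observations.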
Pages: 745-773
Page count: 29
Related papers
50 in total
  • [31] Multiagent Inverse Reinforcement Learning for Two-Person Zero-Sum Games
    Lin, Xiaomin
    Beling, Peter A.
    Cogill, Randy
    IEEE TRANSACTIONS ON GAMES, 2018, 10 (01) : 56 - 68
  • [32] Fast reinforcement learning using stochastic shortest paths for a mobile robot
    Kwon, Wooyoung
    Suh, Il Hong
    Lee, Sanghoon
    Cho, Young-Jo
    2007 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS, VOLS 1-9, 2007, : 82 - +
  • [33] ON STEP SIZES, STOCHASTIC SHORTEST PATHS, AND SURVIVAL PROBABILITIES IN REINFORCEMENT LEARNING
    Gosavi, Abhijit
    2008 WINTER SIMULATION CONFERENCE, VOLS 1-5, 2008, : 525 - 531
  • [34] Lenient Learning in Independent-Learner Stochastic Cooperative Games
    Wei, Ermo
    Luke, Sean
    JOURNAL OF MACHINE LEARNING RESEARCH, 2016, 17
  • [35] Model-free Reinforcement Learning for Stochastic Stackelberg Security Games
    Mishra, Rajesh K.
    Vasal, Deepanshu
    Vishwanath, Sriram
    2020 59TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2020, : 348 - 353
  • [36] Simultaneously Learning and Advising in Multiagent Reinforcement Learning
    da Silva, Felipe Leno
    Glatt, Ruben
    Reali Costa, Anna Helena
    AAMAS'17: PROCEEDINGS OF THE 16TH INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS, 2017, : 1100 - 1108
  • [37] Lateral Transfer Learning for Multiagent Reinforcement Learning
    Shi, Haobin
    Li, Jingchen
    Mao, Jiahui
    Hwang, Kao-Shing
    IEEE TRANSACTIONS ON CYBERNETICS, 2023, 53 (03) : 1699 - 1711
  • [38] Learning to Teach in Cooperative Multiagent Reinforcement Learning
    Omidshafiei, Shayegan
    Kim, Dong-Ki
    Liu, Miao
    Tesauro, Gerald
    Riemer, Matthew
    Amato, Christopher
    Campbell, Murray
    How, Jonathan P.
    THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 6128 - 6136
  • [39] Learning Cooperative Behaviours in Multiagent Reinforcement Learning
    Phon-Amnuaisuk, Somnuk
    NEURAL INFORMATION PROCESSING, PT 1, PROCEEDINGS, 2009, 5863 : 570 - 579
  • [40] Multiagent Learning in Large Anonymous Games
    Kash, Ian A.
    Friedman, Eric J.
    Halpern, Joseph Y.
    JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2011, 40 : 571 - 598