Causal Reinforcement Learning in Iterated Prisoner's Dilemma

Cited by: 2
Authors
Kazemi, Yosra [1]
Chanel, Caroline P. C. [2]
Givigi, Sidney [1]
Affiliations
[1] Queens Univ, Sch Comp, Kingston, ON K7L 2N8, Canada
[2] Univ Toulouse, Inst Super Aeronaut & Espace ISAE SUPAERO, Dept Design & Control Aerosp Vehicles, F-31013 Toulouse, France
Keywords
Causal inference; game theory; prisoner's dilemma (PD); reinforcement learning (RL); social dilemma
DOI
10.1109/TCSS.2023.3289470
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The iterated prisoner's dilemma (IPD) is an archetypal paradigm to model cooperation and has guided studies on social dilemmas. In this work, we develop a causal reinforcement learning (CRL) strategy in a PD game. An agent is designed to have an explicit causal representation of other agents playing strategies from the Axelrod tournament. The collection of policies is assembled in an ensemble RL to choose the best strategy. The agent is then tested against selected Axelrod tournament strategies as well as an adaptive agent trained using traditional RL. Results show that our agent is able to play against all other players and score higher while being adaptive in situations where the strategy of the other players changes. Furthermore, the decision taken by the agent can be explained in terms of the causal representation of the interactions. Based on the decision made by the agent, a human observer can understand the chosen strategy.
Pages: 2523 - 2534
Number of pages: 12
Related papers
50 in total
  • [21] Stationary strategies in iterated prisoner's dilemma
    Levchenkov V.S.
    Levchenkova L.G.
    Computational Mathematics and Modeling, 2006, 17 (3) : 254 - 273
  • [22] PREFERENCE AND EVOLUTION IN THE ITERATED PRISONER’S DILEMMA
    王先甲
    刘伟兵
    Acta Mathematica Scientia, 2009, (02) : 456 - 464
  • [23] Automata playing iterated Prisoner's Dilemma
    Benitez, Antonio
    REVISTA DE FILOSOFIA-MADRID, 2018, 43 (02): 223 - 243
  • [24] Softening and Hardening in the Iterated Prisoner's Dilemma
    Mathieu, Philippe
    Delahaye, Jean-Paul
    IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2023, 53 (02): 654 - 663
  • [25] Shopkeeper Strategies in the Iterated Prisoner's Dilemma
    Ashlock, Daniel
    Kuusela, Christopher
    Cojocaru, Monica
    2011 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC), 2011: 1063 - 1070
  • [26] Payoff Control in the Iterated Prisoner's Dilemma
    Hao, Dong
    Li, Kai
    Zhou, Tao
    PROCEEDINGS OF THE TWENTY-SEVENTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2018: 296 - 302
  • [27] Convergence of the iterated prisoner's dilemma game
    Dyer, M
    Goldberg, LA
    Greenhill, C
    Istrate, G
    Jerrum, M
    COMBINATORICS PROBABILITY & COMPUTING, 2002, 11 (02): 135 - 147
  • [28] Evolving Cooperation for the Iterated Prisoner's Dilemma
    Finocchiaro, Jessica
    Mathias, H. David
    PROCEEDINGS OF THE 2019 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE COMPANION (GECCO'19 COMPANION), 2019: 199 - 200
  • [29] Reactive means in the iterated Prisoner's dilemma
    Molnar, Grant
    Hammond, Caroline
    Fu, Feng
    APPLIED MATHEMATICS AND COMPUTATION, 2023, 458
  • [30] Clans and Cooperation in the Iterated Prisoner's Dilemma
    Julstrom, Bryant A.
    PROCEEDINGS OF THE FOURTEENTH INTERNATIONAL CONFERENCE ON GENETIC AND EVOLUTIONARY COMPUTATION COMPANION (GECCO'12), 2012: 1463 - 1464