Causal Reinforcement Learning in Iterated Prisoner's Dilemma

Cited by: 2
Authors
Kazemi, Yosra [1 ]
Chanel, Caroline P. C. [2 ]
Givigi, Sidney [1 ]
Affiliations
[1] Queens Univ, Sch Comp, Kingston, ON K7L 2N8, Canada
[2] Univ Toulouse, Inst Super Aeronaut & Espace ISAE SUPAERO, Dept Design & Control Aerosp Vehicles, F-31013 Toulouse, France
Keywords
Causal inference; game theory; prisoner's dilemma (PD); reinforcement learning (RL); social dilemma
DOI
10.1109/TCSS.2023.3289470
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The iterated prisoner's dilemma (IPD) is an archetypal paradigm to model cooperation and has guided studies on social dilemmas. In this work, we develop a causal reinforcement learning (CRL) strategy in a PD game. An agent is designed to have an explicit causal representation of other agents playing strategies from the Axelrod tournament. The collection of policies is assembled in an ensemble RL to choose the best strategy. The agent is then tested against selected Axelrod tournament strategies as well as an adaptive agent trained using traditional RL. Results show that our agent is able to play against all other players and score higher while being adaptive in situations where the strategy of the other players changes. Furthermore, the decision taken by the agent can be explained in terms of the causal representation of the interactions. Based on the decision made by the agent, a human observer can understand the chosen strategy.
Pages: 2523-2534
Page count: 12
Related papers
50 entries in total
  • [41] The emergence of cooperation in asynchronous iterated prisoner's dilemma
    Cornforth, David
    Newth, David
    SIMULATED EVOLUTION AND LEARNING, PROCEEDINGS, 2006, 4247 : 742 - 749
  • [42] The Competitions of Forgiving Strategies in the Iterated Prisoner's Dilemma
    Binmad, Ruchdee
    Li, Mingchu
    Deonauth, Nakema
    Hungsapruek, Theerawat
    Limwudhikraijirath, Aree
    2018 IEEE INTERNATIONAL CONFERENCE ON AGENTS (ICA), 2018, : 39 - 43
  • [43] Social Behavior in the Simulation of Iterated Prisoner's Dilemma
    Zhang, Hong-Wei
    Zhou, Kuan-Kuan
    Hu, Neng-Bing
    OPERATIONS RESEARCH AND ITS APPLICATIONS, PROCEEDINGS, 2009, 10 : 46 - 52
  • [44] An exploration of differential utility in iterated prisoner's dilemma
    Ashlock, Dan
    Ashlock, Wendy
    Umphrey, Gary
    PROCEEDINGS OF THE 2006 IEEE SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE IN BIOINFORMATICS AND COMPUTATIONAL BIOLOGY, 2006, : 271 - +
  • [45] Evolving continuous behaviors in the Iterated Prisoner's Dilemma
    Harrald, PG
    Fogel, DB
    BIOSYSTEMS, 1996, 37 (1-2) : 135 - 145
  • [46] Information sharing in the Iterated Prisoner's Dilemma game
    Ghoneim, Ayman
    Abbass, Hussein
    Barlow, Michael
    2007 IEEE SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND GAMES, 2007, : 56 - 62
  • [47] Risk consideration and cooperation in the iterated prisoner's dilemma
    Zeng, Weijun
    Li, Minqiang
    Chen, Fuzan
    Nan, Guofang
    SOFT COMPUTING, 2016, 20 (02) : 567 - 587
  • [48] Evolutionary dynamics of the continuous iterated Prisoner's dilemma
    Le, Stephen
    Boyd, Robert
    JOURNAL OF THEORETICAL BIOLOGY, 2007, 245 (02) : 258 - 267
  • [49] New Winning Strategies for the Iterated Prisoner's Dilemma
    Mathieu, Philippe
    Delahaye, Jean-Paul
    JASSS-THE JOURNAL OF ARTIFICIAL SOCIETIES AND SOCIAL SIMULATION, 2017, 20 (04):
  • [50] Active Player Modeling in the Iterated Prisoner's Dilemma
    Park, Hyunsoo
    Kim, Kyung-Joong
    COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE, 2016, 2016