Online Learning in Iterated Prisoner's Dilemma to Mimic Human Behavior

Authors
Lin, Baihan [1]
Bouneffouf, Djallel [2]
Cecchi, Guillermo [2]
Affiliations
[1] Columbia Univ, New York, NY USA
[2] IBM Res, Yorktown Hts, NY USA
Keywords
Online learning; Bandits; Contextual bandits; Reinforcement learning; Iterated Prisoner's Dilemma; Behavioral modeling; REINFORCEMENT; COOPERATION; PREDICTION; MODELS
DOI
10.1007/978-3-031-20868-3_10
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
As an important psychological and social experiment, the Iterated Prisoner's Dilemma (IPD) treats the choice to cooperate or defect as an atomic action. We propose to study the behaviors of online learning algorithms in the IPD game, where we investigate the full spectrum of reinforcement learning agents: multi-armed bandits, contextual bandits and reinforcement learning. We evaluate them in an IPD tournament in which multiple agents compete in a sequential fashion. This allows us to analyze the dynamics of policies learned by multiple self-interested, independent, reward-driven agents, and also allows us to study the capacity of these algorithms to fit human behaviors. Results suggest that using the current situation to make decisions is the worst-performing approach in this kind of social dilemma game. Multiple discoveries on online learning behaviors and clinical validations are reported, as an effort to connect artificial intelligence algorithms with human behaviors and their abnormal states in neuropsychiatric conditions.
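The kind of setup the abstract describes, a context-free bandit learner playing repeated rounds against another agent, might be sketched as follows. This is a minimal illustration and not the authors' implementation: the payoff values (5, 3, 1, 0), the epsilon-greedy rule, the tit-for-tat opponent, and the names `PAYOFF`, `EpsilonGreedyBandit`, `TitForTat` and `play_match` are all assumptions introduced here.

```python
import random

# Classical payoff values (an assumption; the abstract does not state them).
# Key: (my_action, opponent_action) -> my reward. "C" = cooperate, "D" = defect.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}
ACTIONS = ("C", "D")


class EpsilonGreedyBandit:
    """Hypothetical two-armed bandit: ignores context, tracks mean reward per action."""

    def __init__(self, epsilon=0.1, rng=None):
        self.epsilon = epsilon
        self.rng = rng or random.Random()
        self.counts = {a: 0 for a in ACTIONS}
        self.values = {a: 0.0 for a in ACTIONS}

    def act(self, opponent_last):
        # The opponent's last move is deliberately unused (no context).
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)  # explore
        return max(ACTIONS, key=lambda a: self.values[a])  # exploit

    def update(self, action, reward):
        # Incremental running mean of the reward observed for this action.
        self.counts[action] += 1
        self.values[action] += (reward - self.values[action]) / self.counts[action]


class TitForTat:
    """Fixed baseline: cooperate first, then copy the opponent's previous move."""

    def act(self, opponent_last):
        return "C" if opponent_last is None else opponent_last

    def update(self, action, reward):
        pass  # a fixed strategy does not learn


def play_match(agent_a, agent_b, rounds=200):
    """Play one iterated match; return both total scores and the move history."""
    last_a = last_b = None
    score_a = score_b = 0
    history = []
    for _ in range(rounds):
        move_a = agent_a.act(last_b)
        move_b = agent_b.act(last_a)
        reward_a = PAYOFF[(move_a, move_b)]
        reward_b = PAYOFF[(move_b, move_a)]
        agent_a.update(move_a, reward_a)
        agent_b.update(move_b, reward_b)
        score_a += reward_a
        score_b += reward_b
        history.append((move_a, move_b))
        last_a, last_b = move_a, move_b
    return score_a, score_b, history
```

Calling `play_match(EpsilonGreedyBandit(rng=random.Random(0)), TitForTat())` returns the two cumulative scores and the sequence of joint moves. A tournament in the sense of the abstract would repeat such matches over every pairing of agents, and a contextual bandit or reinforcement learning agent would differ mainly in using `opponent_last` (or a longer history) inside `act`.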
Pages: 134-147
Number of pages: 14