Iterative ADP learning algorithms for discrete-time multi-player games

Cited by: 53
Authors
Jiang, He [1]
Zhang, Huaguang [1]
Affiliations
[1] Northeastern Univ, Coll Informat Sci & Engn, Shenyang 110819, Liaoning, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Adaptive dynamic programming; Approximate dynamic programming; Reinforcement learning; Neural network; ZERO-SUM GAMES; UNCERTAIN NONLINEAR-SYSTEMS; H-INFINITY CONTROL; CONSTRAINED-INPUT; POLICY ITERATION; EQUATION; DESIGNS;
DOI
10.1007/s10462-017-9603-1
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Adaptive dynamic programming (ADP) is an important branch of reinforcement learning for solving various optimal control problems. Most practical nonlinear systems are controlled by more than one controller. Each controller is a player, and the tradeoff between cooperation and conflict among these players can be viewed as a game. Multi-player games are divided into two main categories: zero-sum games and non-zero-sum games. To obtain the optimal control policy for each player, one needs to solve Hamilton-Jacobi-Isaacs equations for zero-sum games and a set of coupled Hamilton-Jacobi equations for non-zero-sum games. Unfortunately, these equations are generally difficult or even impossible to solve analytically. To overcome this bottleneck, two ADP methods are proposed in this paper: a modified gradient-descent-based online algorithm and a novel iterative offline learning approach. Furthermore, to implement the proposed methods, we employ a single-network structure, which reduces the computational burden compared with the traditional multiple-network architecture. Simulation results demonstrate the effectiveness of our schemes.
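As a rough illustration of why iterative schemes are attractive here, the sketch below runs plain value iteration on a scalar linear-quadratic zero-sum game, a special case in which the Hamilton-Jacobi-Isaacs equation reduces to a scalar game Riccati equation. This is not the algorithm proposed in the paper: the system x_{k+1} = a x_k + b u_k + d w_k, the stage cost q x^2 + r u^2 - gamma^2 w^2, the function name and all parameter values are assumptions chosen only for illustration.

```python
def solve_scalar_zero_sum(a, b, d, q, r, gamma, iters=500, tol=1e-12):
    """Value iteration for a scalar LQ zero-sum game (illustrative assumption).

    Dynamics: x_next = a*x + b*u + d*w, where u minimizes and w maximizes
    the sum of q*x**2 + r*u**2 - gamma**2 * w**2.
    The value function is assumed to have the form V(x) = p * x**2.
    """
    # s collects the effect of both players on the Riccati recursion.
    s = b * b / r - d * d / gamma ** 2
    p = 0.0  # start from V_0(x) = 0
    for _ in range(iters):
        # One Bellman backup with both players at their saddle-point actions.
        p_next = q + a * a * p / (1.0 + s * p)
        converged = abs(p_next - p) < tol
        p = p_next
        if converged:
            break
    # The maximizing player's problem must be concave for a saddle point.
    if gamma ** 2 - d * d * p <= 0:
        raise ValueError("gamma is too small: no saddle point exists")
    z = 1.0 + s * p
    k_u = -p * b * a / (r * z)           # control policy u = k_u * x
    k_w = p * d * a / (gamma ** 2 * z)   # disturbance policy w = k_w * x
    return p, k_u, k_w


if __name__ == "__main__":
    p, k_u, k_w = solve_scalar_zero_sum(0.9, 1.0, 0.5, 1.0, 1.0, 2.0)
    print("p =", p, "k_u =", k_u, "k_w =", k_w)
```

In this toy case the fixed point can be checked by substitution. For nonlinear systems no such closed form is available, which is the setting in which the abstract proposes neural-network approximation instead.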
Pages: 75-91
Page count: 17
Related papers
50 in total
  • [21] Automatic Detection of Peer Interactions in Multi-player Learning Games
    Guinebert, Mathieu
    Yessad, Amel
    Muratet, Mathieu
    Luengo, Vanda
    TRANSFORMING LEARNING WITH MEANINGFUL TECHNOLOGIES, EC-TEL 2019, 2019, 11722 : 349 - 361
  • [22] Learning enables adaptation in cooperation for multi-player stochastic games
    Huang, Feng
    Cao, Ming
    Wang, Long
    JOURNAL OF THE ROYAL SOCIETY INTERFACE, 2020, 17 (172)
  • [23] Inverse reinforcement learning for multi-player noncooperative apprentice games
    Lian, Bosen
    Xue, Wenqian
    Lewis, Frank L.
    Chai, Tianyou
    AUTOMATICA, 2022, 145
  • [24] Experience Management in Multi-player Games
    Zhu, Jichen
    Ontanon, Santiago
    2019 IEEE CONFERENCE ON GAMES (COG), 2019,
  • [25] Deterministic multi-player Dynkin games
    Solan, E
    Vieille, N
    JOURNAL OF MATHEMATICAL ECONOMICS, 2003, 39 (08) : 911 - 929
  • [26] Concurrent Multi-Player Parity Games
    Malvone, Vadim
    Murano, Aniello
    Sorrentino, Loredana
    AAMAS'16: PROCEEDINGS OF THE 2016 INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS & MULTIAGENT SYSTEMS, 2016, : 689 - 697
  • [27] AN ANALYSIS OF UCT IN MULTI-PLAYER GAMES
    Sturtevant, Nathan
    ICGA JOURNAL, 2008, 31 (04) : 195 - 208
  • [28] An Analysis of UCT in Multi-player Games
    Sturtevant, Nathan R.
    COMPUTERS AND GAMES, 2008, 5131 : 37 - 49
  • [29] Hiding Actions in Multi-Player Games
    Malvone, Vadim
    Murano, Aniello
    Sorrentino, Loredana
    AAMAS'17: PROCEEDINGS OF THE 16TH INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS, 2017, : 1205 - 1213
  • [30] Event-triggered optimal control for discrete-time multi-player non-zero-sum games using parallel control
    Lu, Jingwei
    Wei, Qinglai
    Wang, Ziyang
    Zhou, Tianmin
    Wang, Fei-Yue
    INFORMATION SCIENCES, 2022, 584 : 519 - 535