Deep Reinforcement Learning for Nash Equilibrium of Differential Games

Cited by: 2
Authors
Li, Zhenyu [1 ,2 ]
Luo, Yazhong [1 ]
Affiliations
[1] National University of Defense Technology, College of Aerospace Science and Engineering, Changsha 410073, People's Republic of China
[2] Beijing Institute of Tracking and Telecommunications Technology, Beijing 100094, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
Games; Nash equilibrium; Differential games; Reinforcement learning; Heuristic algorithms; Mathematical models; Artificial neural networks; Deep reinforcement learning (DRL); spacecraft pursuit-evasion; symplectic policy gradient theorem
DOI
10.1109/TNNLS.2024.3351631
CLC (Chinese Library Classification) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Nash equilibrium is a significant solution concept representing the optimal strategy in a noncooperative multiagent system. This study presents two deep reinforcement learning (DRL) algorithms for solving the Nash equilibrium of differential games. Both build on the distributed distributional deep deterministic policy gradient (D4PG) algorithm, a one-sided learning method that we modify into a two-sided adversarial learning method. The first, D4PG for games (D4P2G), directly applies an adversarial-play framework to D4PG and employs a simultaneous policy gradient descent (SPGD) method to optimize the policies of players with conflicting objectives. The second, the distributional deep deterministic symplectic policy gradient (D4SPG) algorithm, is our main contribution: it designs a minimax learning framework that combines the critics of the two players and proposes a symplectic policy gradient adjustment method to find a better policy gradient. Simulations show that both algorithms converge to the Nash equilibrium in most cases, but D4SPG learns the Nash equilibrium more accurately and efficiently, especially in Hamiltonian games. Moreover, it can handle games with complex dynamics, which is challenging for traditional methods.
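The abstract contrasts two update schemes: SPGD, where both players step along their own gradients simultaneously, and a symplectic policy gradient adjustment, which corrects the joint gradient to damp the rotational dynamics that make adversarial learning cycle around equilibria. The sketch below is an illustrative toy, not the authors' D4P2G/D4SPG implementations (which operate on actor-critic networks and distributional critics): it runs both schemes on the scalar zero-sum bilinear game f(x, y) = xy, whose unique Nash equilibrium is (0, 0), with the correction written in the spirit of symplectic gradient adjustment; the step sizes eta and lam are assumed values chosen for the demonstration.

```python
# Minimal sketch (an assumed illustration, not the paper's code): simultaneous
# policy gradient descent vs. a symplectic-style gradient adjustment on the
# zero-sum bilinear game f(x, y) = x * y. Player x minimizes f, player y
# maximizes f; the unique Nash equilibrium is (0, 0).
import numpy as np

def simultaneous_gd(x, y, eta=0.1, steps=200):
    """SPGD-style update: both players step on their own gradient at once."""
    for _ in range(steps):
        x, y = x - eta * y, y + eta * x  # df/dx = y (descent), df/dy = x (ascent)
    return x, y

def symplectic_adjusted_gd(x, y, eta=0.1, lam=1.0, steps=200):
    """Adjusted update in the spirit of symplectic gradient adjustment:
    add lam * A^T xi to the joint gradient xi, where A is the antisymmetric
    part of the game Jacobian, damping the rotational component."""
    A = np.array([[0.0, 1.0], [-1.0, 0.0]])  # Jacobian of xi; purely antisymmetric here
    for _ in range(steps):
        xi = np.array([y, -x])               # joint descent field for both players' losses
        adj = xi + lam * (A.T @ xi)          # adjusted field: (y + lam*x, -x + lam*y)
        x, y = x - eta * adj[0], y - eta * adj[1]
    return x, y

if __name__ == "__main__":
    print("SPGD:    ", simultaneous_gd(1.0, 1.0))         # spirals away from (0, 0)
    print("adjusted:", symplectic_adjusted_gd(1.0, 1.0))  # contracts toward (0, 0)
```

On this game, plain simultaneous descent has an update matrix with spectral radius above 1 and spirals outward, while the adjusted update contracts toward the equilibrium; this mirrors, in miniature, the accuracy and efficiency gap the abstract reports for Hamiltonian games.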
Pages: 1-15
Related papers
50 records in total
  • [41] Social Learning Algorithms Reaching Nash Equilibrium in Symmetric Cournot Games
    Protopapas, Mattheos K.; Battaglia, Francesco; Kosmatopoulos, Elias B.
    Applications of Evolutionary Computation, Pt I, Proceedings, 2010, 6024: 191+
  • [42] Nash equilibrium realization of population games based on social learning processes
    Xing, Zhiyan; Yang, Yanlong; Hu, Zuopeng
    Mathematical Biosciences and Engineering, 2023, 20(09): 17116-17137
  • [43] Does rational learning lead to Nash equilibrium in finitely repeated games?
    Sandroni, A.
    Journal of Economic Theory, 1998, 78(01): 195-218
  • [44] Computational Performance of Deep Reinforcement Learning to Find Nash Equilibria
    Graf, Christoph; Zobernig, Viktor; Schmidt, Johannes; Kloeckl, Claude
    Computational Economics, 2024, 63(02): 529-576
  • [46] Convergence of reinforcement learning to Nash equilibrium: A search-market experiment
    Darmon, E.; Waldeck, R.
    Physica A: Statistical Mechanics and its Applications, 2005, 355(01): 119-130
  • [47] Discontinuous Nash equilibrium points for nonzero-sum stochastic differential games
    Hamadene, Said; Mu, Rui
    Stochastic Processes and their Applications, 2020, 130(11): 6901-6926
  • [48] Nash and Stackelberg differential games
    Bensoussan, Alain; Frehse, Jens; Vogelgesang, Jens
    Chinese Annals of Mathematics, Series B, 2012, 33(03): 317-332