GENERAL PROOF OF CONVERGENCE OF THE NASH-Q-LEARNING ALGORITHM

Cited by: 0
Authors
Wang, Jun [1]
Cao, Lei [1]
Chen, Xiliang [1]
Lai, Jun [1]
Affiliations
[1] Army Engn Univ PLA, Command Control Engn Inst, Nanjing 211101, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Nash-Q-Learning; Game Theory; Schauder; Fractals;
DOI
10.1142/S0218348X2250027X
Chinese Library Classification
O1 [Mathematics];
Discipline classification code
0701 ; 070101 ;
Abstract
This paper studies the convergence of the Nash-Q-Learning algorithm. The previous convergence proof requires each stage game to have a global optimal point or a saddle point. This assumption is so strict that it leaves few application scenarios for the algorithm. Yet the algorithm also converges in the two Grid-World Games, which do not satisfy the assumption, and previous researchers therefore proposed that the assumption may be appropriately relaxed. A rigorous theoretical proof, however, was not given. Viewing the convergence point as a fractal attractor, this paper gives a general mathematical proof of convergence of the Nash-Q-Learning algorithm. The efficiency and scalability of the algorithm are also discussed in detail.
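The assumption discussed in the abstract can be made concrete with a minimal sketch. It is not taken from the paper. It assumes a two-player stage game given as payoff matrices `A` (row player) and `B` (column player), considers pure strategies only, and assumes the definitions commonly used in earlier Nash-Q work: a global optimal point is a joint action where every player receives its highest payoff, and a saddle point is a Nash equilibrium at which no player is hurt when the other player deviates. The function names are hypothetical.

```python
def global_optima(A, B):
    """Pure joint actions (i, j) where both players get their maximum payoff."""
    a_max = max(max(row) for row in A)
    b_max = max(max(row) for row in B)
    return [(i, j)
            for i in range(len(A))
            for j in range(len(A[0]))
            if A[i][j] == a_max and B[i][j] == b_max]


def saddle_points(A, B):
    """Pure Nash equilibria where a deviation by the opponent never lowers
    a player's payoff (assumed definition, pure strategies only)."""
    rows, cols = len(A), len(A[0])
    found = []
    for i in range(rows):
        for j in range(cols):
            # Nash condition: neither player gains by deviating unilaterally.
            is_nash = (A[i][j] >= max(A[r][j] for r in range(rows))
                       and B[i][j] >= max(B[i]))
            # Saddle condition: the opponent's deviation does not hurt.
            not_hurt = (A[i][j] <= min(A[i])
                        and B[i][j] <= min(B[r][j] for r in range(rows)))
            if is_nash and not_hurt:
                found.append((i, j))
    return found


def meets_assumption(A, B):
    """True if the stage game has a pure global optimum or a pure saddle point."""
    return bool(global_optima(A, B) or saddle_points(A, B))


# Illustrative payoff matrices (made up for this sketch, not from the paper).
coordination = ([[2, 0], [0, 1]], [[2, 0], [0, 1]])
conflict = ([[2, 0], [0, 1]], [[1, 0], [0, 2]])
print(meets_assumption(*coordination), meets_assumption(*conflict))
```

A stage game such as `conflict`, where the players prefer different equilibria, has neither kind of point in pure strategies, which illustrates why the assumption excludes many games.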
Pages: 9
Related papers
50 in total
  • [21] NEW PROOF OF GLOBAL CONVERGENCE FOR TRIDIAGONAL QL ALGORITHM
    HOFFMANN, W
    PARLETT, BN
    SIAM JOURNAL ON NUMERICAL ANALYSIS, 1978, 15 (05) : 929 - 937
  • [22] A Generalization of Omura's Decoding Algorithm and a Proof of Convergence
    Axvig, Nathan
    IEEE TRANSACTIONS ON INFORMATION THEORY, 2014, 60 (06) : 3292 - 3301
  • [23] Convergence Proof of a Class of Adaptive Ant Colony Algorithm
    Zhao, Baojiang
    PROCEEDINGS OF THE 2015 4TH INTERNATIONAL CONFERENCE ON COMPUTER, MECHATRONICS, CONTROL AND ELECTRONIC ENGINEERING (ICCMCEE 2015), 2015, 37 : 976 - 979
  • [24] NEW PROOF OF GLOBAL CONVERGENCE FOR TRIDIAGONAL QL ALGORITHM
    PARLETT, BN
    HOFFMAN, W
    SIAM REVIEW, 1978, 20 (03) : 632 - 632
  • [25] Proof of convergence for the distributed optimal rate assignment algorithm
    Rezaiifar, R
    Holtzman, J
    1999 IEEE 49TH VEHICULAR TECHNOLOGY CONFERENCE, VOLS 1-3: MOVING INTO A NEW MILLENIUM, 1999, : 1841 - 1845
  • [27] CONVERGENCE PROPERTIES OF LEARNING ALGORITHM
    BREIMAN, L
    WURTELE, ZS
    ANNALS OF MATHEMATICAL STATISTICS, 1964, 35 (04) : 1819 - &
  • [28] Learning convergence of CMAC algorithm
    He, C
    Xu, LX
    Zhang, YH
    NEURAL PROCESSING LETTERS, 2001, 14 (01) : 61 - 74
  • [29] Learning Convergence of CMAC Algorithm
    Chao He
    Lixin Xu
    Yuhe Zhang
    Neural Processing Letters, 2001, 14 : 61 - 74
  • [30] Hierarchical Nash-Q Learning in Continuous Games
    Sahraei-Ardakani, Mostafa
    Rahimi-Kian, Ashkan
    Nili-Ahmadabadi, Majid
    2008 IEEE SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND GAMES, 2008, : 290 - 295