The Improvement of Q-learning Applied to Imperfect Information Game

Cited by: 4
Authors
Lin, Jing [1 ]
Wang, Xuan [1 ]
Han, Lijiao [2 ]
Zhang, Jiajia [1 ]
Xi, Xinxin [1 ]
Affiliations
[1] HIT Shenzhen Grad Sch, Intelligence Comp Res Ctr, Shenzhen, Peoples R China
[2] Shenyang Univ Technol, Sch Management, Shenyang, Peoples R China
Keywords
Q-learning; truncated TD; simulated annealing; imperfect information game;
DOI
10.1109/ICSMC.2009.5346316
CLC Number
TP3 [Computing Technology, Computer Technology];
Discipline Code
0812;
Abstract
The standard Q-learning algorithm suffers from slow convergence and a tendency to become trapped in local optima. Truncated TD estimation improves the efficiency of return estimates, while the simulated annealing algorithm increases the chance of exploration. To accelerate convergence and avoid local optima, this paper combines the Q-learning algorithm with truncated TD estimation and simulated annealing. We apply the improved Q-learning algorithm to an imperfect information game (the SiGuo military chess game) and realize a self-learning imperfect information game system. Experimental results show that the system can dynamically adjust the weights describing the game state according to the outcomes. Furthermore, it speeds up the learning process, effectively simulates human intelligence, makes reasonable moves, and significantly improves system performance.
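The abstract outlines a combination of Q-learning with truncated TD return estimation and simulated-annealing-style exploration. Below is a minimal, hypothetical sketch of that combination on a toy chain environment, not the authors' SiGuo military chess system; the environment, the hyperparameters (temperature schedule, truncation horizon n, learning rate), and the simplified per-chunk update are illustrative assumptions.

```python
import math
import random
from collections import defaultdict

class ChainEnv:
    """Toy 1-D chain: actions move left/right, reward +1 at the right end."""
    def __init__(self, length=10):
        self.length = length
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):  # action: 0 = left, 1 = right
        move = 1 if action == 1 else -1
        self.state = max(0, min(self.length - 1, self.state + move))
        done = self.state == self.length - 1
        return self.state, (1.0 if done else 0.0), done

def boltzmann_action(q, state, actions, temperature):
    """Simulated-annealing-style exploration: softmax over Q-values with a
    temperature that is annealed toward zero as training proceeds."""
    prefs = [q[(state, a)] / max(temperature, 1e-6) for a in actions]
    m = max(prefs)
    weights = [math.exp(p - m) for p in prefs]
    return random.choices(actions, weights=weights)[0]

def truncated_return(rewards, q, boot_state, actions, gamma):
    """Truncated TD return: discounted rewards for the rolled-out steps,
    bootstrapped with max_a Q(boot_state, a)."""
    g = sum((gamma ** i) * r for i, r in enumerate(rewards))
    g += (gamma ** len(rewards)) * max(q[(boot_state, a)] for a in actions)
    return g

def train(episodes=300, n=3, gamma=0.95, alpha=0.1, t0=1.0, cooling=0.98):
    env = ChainEnv()
    actions = [0, 1]
    q = defaultdict(float)
    temperature = t0
    for _ in range(episodes):
        state = env.reset()
        done = False
        while not done:
            # Roll out up to n steps and update only the chunk's first
            # state-action pair (a simplified truncated-return update).
            s0 = state
            a0 = boltzmann_action(q, s0, actions, temperature)
            action, rewards = a0, []
            for _ in range(n):
                state, r, done = env.step(action)
                rewards.append(r)
                if done:
                    break
                action = boltzmann_action(q, state, actions, temperature)
            g = truncated_return(rewards, q, state, actions, gamma)
            q[(s0, a0)] += alpha * (g - q[(s0, a0)])
        temperature *= cooling  # anneal the temperature: explore less over time
    return q

if __name__ == "__main__":
    q = train()
    greedy = [max([0, 1], key=lambda a: q[(s, a)]) for s in range(10)]
    print("Greedy action per state (1 = move right):", greedy)
```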
Pages: 1562 / +
Number of pages: 3
Related Papers
50 records in total
  • [21] Acquiring the positioning skill in a soccer game using a fuzzy Q-learning
    Nakashima, T
    Udo, M
    Ishibuchi, H
    [J]. 2003 IEEE INTERNATIONAL SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE IN ROBOTICS AND AUTOMATION, VOLS I-III, PROCEEDINGS, 2003, : 1488 - 1491
  • [22] Accelerating Nash Q-Learning with Graphical Game Representation and Equilibrium Solving
    Zhuang, Yunkai
    Chen, Xingguo
    Gao, Yang
    Hu, Yujing
    [J]. 2019 IEEE 31ST INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2019), 2019, : 939 - 946
  • [23] SIMULATION OF ENTERPRISE INFORMATION SECURITY DECISION-MAKING TRILATERAL GAME BASED ON Q-LEARNING AND CELLULAR AUTOMATA
    Liu, Xiang
    Liu, Jia
    Wen, Tianqi
    [J]. TRANSFORMATIONS IN BUSINESS & ECONOMICS, 2017, 16 (2B): : 742 - 754
  • [24] Constrained Deep Q-Learning Gradually Approaching Ordinary Q-Learning
    Ohnishi, Shota
    Uchibe, Eiji
    Yamaguchi, Yotaro
    Nakanishi, Kosuke
    Yasui, Yuji
    Ishii, Shin
    [J]. FRONTIERS IN NEUROROBOTICS, 2019, 13
  • [25] Learning rates for Q-Learning
    Even-Dar, E
    Mansour, Y
    [J]. COMPUTATIONAL LEARNING THEORY, PROCEEDINGS, 2001, 2111 : 589 - 604
  • [26] Learning rates for Q-learning
    Even-Dar, E
    Mansour, Y
    [J]. JOURNAL OF MACHINE LEARNING RESEARCH, 2003, 5 : 1 - 25
  • [27] Self-Learning PD Game With Imperfect Information on Networks
    Li, Zhuozheng
    Chu, Tianguang
    Wang, Long
    [J]. PROCEEDINGS OF THE 48TH IEEE CONFERENCE ON DECISION AND CONTROL, 2009 HELD JOINTLY WITH THE 2009 28TH CHINESE CONTROL CONFERENCE (CDC/CCC 2009), 2009, : 6864 - 6869
  • [28] Contextual Q-Learning
    Pinto, Tiago
    Vale, Zita
    [J]. ECAI 2020: 24TH EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, 325 : 2927 - 2928
  • [29] Zap Q-Learning
    Devraj, Adithya M.
    Meyn, Sean P.
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30
  • [30] CVaR Q-Learning
    Stanko, Silvestr
    Macek, Karel
    [J]. COMPUTATIONAL INTELLIGENCE: 11th International Joint Conference, IJCCI 2019, Vienna, Austria, September 17-19, 2019, Revised Selected Papers, 2021, 922 : 333 - 358