The Improvement of Q-learning Applied to Imperfect Information Game

Cited by: 4
Authors
Lin, Jing [1 ]
Wang, Xuan [1 ]
Han, Lijiao [2 ]
Zhang, Jiajia [1 ]
Xi, Xinxin [1 ]
Affiliations
[1] HIT Shenzhen Grad Sch, Intelligence Comp Res Ctr, Shenzhen, Peoples R China
[2] Shenyang Univ Technol, Sch Management, Shenyang, Peoples R China
Keywords
Q-learning; truncated TD; simulated annealing; imperfect information game;
DOI
10.1109/ICSMC.2009.5346316
Chinese Library Classification (CLC)
TP3 [Computing Technology, Computer Technology];
Discipline Code
0812;
Abstract
The standard Q-learning algorithm suffers from slow convergence and a tendency to settle in local optima. Truncated TD estimation makes return estimation more efficient, and the simulated annealing algorithm increases the chance of exploration. To accelerate convergence and to avoid local optima, this paper combines the Q-learning algorithm with truncated TD estimation and simulated annealing. We apply the improved Q-learning algorithm to an imperfect information game (the SiGuo military chess game) and realize a self-learning imperfect-information game system. Experimental results show that the system dynamically adjusts each weight describing the game state according to game outcomes. Furthermore, it speeds up the learning process, effectively simulates human intelligence, makes reasonable moves, and significantly improves system performance.
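The record does not give the authors' exact update rules, so the following is only a minimal Python sketch of the general idea the abstract describes: Q-learning whose target uses a return truncated after n steps (a truncated-TD-style estimate) and whose exploration follows a Boltzmann distribution with an annealed temperature. The environment interface (reset/step), function names, and parameter values are illustrative assumptions, not the paper's implementation.

# Minimal sketch, not the authors' implementation: environment interface,
# names, and parameters are assumptions for illustration only.
import math
import random
from collections import defaultdict

def boltzmann_action(Q, state, actions, temperature):
    """Simulated-annealing-style exploration: sample an action from a Boltzmann
    distribution over Q-values; high temperature explores, low temperature exploits."""
    prefs = [math.exp(Q[(state, a)] / temperature) for a in actions]
    total = sum(prefs)
    threshold = random.random() * total
    cumulative = 0.0
    for action, pref in zip(actions, prefs):
        cumulative += pref
        if threshold <= cumulative:
            return action
    return actions[-1]

def improved_q_learning(env, actions, episodes=500, n=5, alpha=0.1, gamma=0.95,
                        t_start=1.0, t_min=0.01, cooling=0.995):
    """Q-learning with a truncated n-step TD return and annealed exploration."""
    Q = defaultdict(float)
    temperature = t_start

    def update(trajectory, bootstrap_state):
        # Update the oldest stored (state, action) pair with a return truncated
        # after len(trajectory) steps, bootstrapping from max_a Q(bootstrap_state, a).
        s0, a0, _ = trajectory[0]
        G = sum((gamma ** k) * r for k, (_, _, r) in enumerate(trajectory))
        if bootstrap_state is not None:
            G += (gamma ** len(trajectory)) * max(Q[(bootstrap_state, a)] for a in actions)
        Q[(s0, a0)] += alpha * (G - Q[(s0, a0)])

    for _ in range(episodes):
        state = env.reset()
        trajectory = []          # sliding window of (state, action, reward)
        done = False
        while not done:
            action = boltzmann_action(Q, state, actions, temperature)
            next_state, reward, done = env.step(action)
            trajectory.append((state, action, reward))
            state = next_state
            if len(trajectory) == n:      # window full: apply the truncated return
                update(trajectory, state)
                trajectory.pop(0)
        while trajectory:                 # episode over: flush without bootstrapping
            update(trajectory, None)
            trajectory.pop(0)
        # Anneal the temperature so exploration gradually gives way to exploitation.
        temperature = max(t_min, temperature * cooling)
    return Q

Truncating the return after n steps bounds how far credit propagates per update while still using more than one real reward, and annealing the exploration temperature mirrors simulated annealing: broad search early in training, near-greedy play once the value estimates stabilize.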
Pages: 1562+
Number of pages: 3
Related Papers
50 records
  • [1] A Deep Q-Learning based approach applied to the Snake game
    Sebastianelli, Alessandro
    Tipaldi, Massimo
    Ullo, Silvia Liberata
    Glielmo, Luigi
    [J]. 2021 29TH MEDITERRANEAN CONFERENCE ON CONTROL AND AUTOMATION (MED), 2021, : 348 - 353
  • [2] Evolution of cooperation in the public goods game with Q-learning
    Zheng, Guozhong
    Zhang, Jiqiang
    Deng, Shengfeng
    Cai, Weiran
    Chen, Li
    [J]. CHAOS SOLITONS & FRACTALS, 2024, 188
  • [3] Double Q-learning Agent for Othello Board Game
    Somasundaram, Thamarai Selvi
    Panneerselvam, Karthikeyan
    Bhuthapuri, Tarun
    Mahadevan, Harini
    Jose, Ashik
    [J]. 2018 10TH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTING (ICOAC), 2018, : 216 - 223
  • [4] QMIMC: Q-Learning Model Based on Imperfect-information under Multi-agent Crowdtesting
    Zhang, Jie
    Li, Kefan
    Zhang, Baoming
    Xu, Ming
    Wang, Chongjun
    [J]. 19TH IEEE INTERNATIONAL SYMPOSIUM ON PARALLEL AND DISTRIBUTED PROCESSING WITH APPLICATIONS (ISPA/BDCLOUD/SOCIALCOM/SUSTAINCOM 2021), 2021, : 1110 - 1117
  • [5] Information Theoretic Model Predictive Q-Learning
    Bhardwaj, Mohak
    Handa, Ankur
    Fox, Dieter
    Boots, Byron
    [J]. LEARNING FOR DYNAMICS AND CONTROL, VOL 120, 2020, 120 : 840 - 850
  • [6] Defense decision-making method based on incomplete information stochastic game and Q-learning
    Yang, Junnan
    [J]. JOURNAL ON COMMUNICATIONS, 2018, 39
  • [7] Q-Learning applied to the problem of scheduling on heterogeneous architectures
    Hajoui, Younes
    Bouattane, Omar
    Youssfi, Mohamed
    Illoussamen, Elhocein
    [J]. INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2018, 18 (02): : 153 - 159
  • [8] Deep Q-Learning in Robotics: Improvement of Accuracy and Repeatability
    Sumanas, Marius
    Petronis, Algirdas
    Bucinskas, Vytautas
    Dzedzickis, Andrius
    Virzonis, Darius
    Morkvenaite-Vilkonciene, Inga
    [J]. SENSORS, 2022, 22 (10)
  • [9] The emergence of cooperation via Q-learning in spatial donation game
    Zhang, Jing
    Rong, Zhihai
    Zheng, Guozhong
    Zhang, Jiqiang
    Chen, Li
    [J]. JOURNAL OF PHYSICS-COMPLEXITY, 2024, 5 (02):
  • [10] Assessing the Potential of Classical Q-learning in General Game Playing
    Wang, Hui
    Emmerich, Michael
    Plaat, Aske
    [J]. ARTIFICIAL INTELLIGENCE, BNAIC 2018, 2019, 1021 : 138 - 150