Upper Confident Bound Fuzzy Q-learning and Its Application to a Video Game

Authors
Morita, Takahiro [1]
Hosobe, Hiroshi [2]
Affiliations
[1] Hosei Univ, Grad Sch Comp & Informat Sci, Tokyo, Japan
[2] Hosei Univ, Fac Comp & Informat Sci, Tokyo, Japan
Keywords
Machine Learning; Fuzzy Q-learning; UCB Algorithm; Video Game
DOI
10.5220/0010835700003116
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper proposes upper confidence bound (UCB) fuzzy Q-learning, which combines fuzzy Q-learning with the UCBQ algorithm, and applies it to a video game. The UCBQ algorithm improves on the action selection method known as the UCB algorithm by applying it to Q-learning. The UCB algorithm selects the action with the highest UCB value rather than the highest value estimate. Because the UCB algorithm is premised on every unselected action eventually being selected so that its value estimate is obtained, the number of unselected actions becomes small, which helps avoid locally optimal solutions. The proposed method aims to make learning more efficient by reducing unselected actions and preventing the Q value from settling on a locally optimal solution in fuzzy Q-learning. This paper applies the proposed method to the video game Ms. PacMan and presents the result of an experiment on finding optimum values in the method. The method is evaluated by comparing its game scores with the scores obtained by a previous fuzzy Q-learning method. The result shows that the proposed method significantly reduced unselected actions.
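The action selection rule described in the abstract can be illustrated with a minimal sketch. It assumes the textbook UCB1-style score, value estimate plus c * sqrt(ln(total selections) / selections of this action), and is not taken from the paper; the exact UCBQ formula and its fuzzy Q-learning integration may differ. The function name `ucb_select` and the exploration constant `c` are hypothetical names chosen for illustration.

```python
import math


def ucb_select(q_values, counts, c=1.0):
    """Return the index of the action with the highest UCB score.

    Assumption: a UCB1-style bonus is used; the paper's UCBQ rule may differ.
    q_values: current value estimate for each action
    counts:   number of times each action has been selected so far
    c:        hypothetical exploration constant
    """
    total = sum(counts)
    best_action, best_score = 0, -math.inf
    for action, (q, n) in enumerate(zip(q_values, counts)):
        if n == 0:
            # An action that has never been selected is tried first,
            # which is what keeps the number of unselected actions small.
            return action
        score = q + c * math.sqrt(math.log(total) / n)
        if score > best_score:
            best_action, best_score = action, score
    return best_action


# Action 2 has never been selected, so it is chosen despite its low estimate.
print(ucb_select([1.0, 0.5, 0.0], [10, 5, 0]))
```

With `c=0` the rule reduces to greedy selection over the value estimates once every action has been tried at least once; larger `c` shifts choices toward rarely selected actions.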
Pages: 454-461
Number of pages: 8
Related papers
50 in total
  • [21] Q-learning intelligent jamming decision algorithm based on efficient upper confidence bound variance
    Rao N.
    Xu H.
    Song B.
    Harbin Gongye Daxue Xuebao/Journal of Harbin Institute of Technology, 2022, 54 (05): : 162 - 170
  • [22] Fuzzy Q-Learning with the modified fuzzy ART neural network
    Ueda, Hiroaki
    Naraki, Takeshi
    Hanada, Naoki
    Kimoto, Hideaki
    Takahashi, Kenichi
    Miyahara, Tetsuhiro
    Web Intelligence and Agent Systems, 2007, 5 (03): : 331 - 341
  • [23] Fuzzy Q-learning with the modified fuzzy ART neural network
    Ueda, H
    Hanada, N
    Kimoto, H
    Naraki, T
    Takahashi, K
    Miyahara, T
    2005 IEEE/WIC/ACM INTERNATIONAL CONFERENCE ON INTELLIGENT AGENT TECHNOLOGY, PROCEEDINGS, 2005, : 308 - 315
  • [24] The Improvement of Q-learning Applied to Imperfect Information Game
    Lin, Jing
    Wang, Xuan
    Han, Lijiao
    Zhang, Jiajia
    Xi, Xinxin
    2009 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS (SMC 2009), VOLS 1-9, 2009, : 1562 - +
  • [25] Double Q-learning Agent for Othello Board Game
    Somasundaram, Thamarai Selvi
    Panneerselvam, Karthikeyan
    Bhuthapuri, Tarun
    Mahadevan, Harini
    Jose, Ashik
    2018 10TH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTING (ICOAC), 2018, : 216 - 223
  • [26] Evolution of cooperation in the public goods game with Q-learning
    Zheng, Guozhong
    Zhang, Jiqiang
    Deng, Shengfeng
    Cai, Weiran
    Chen, Li
    CHAOS SOLITONS & FRACTALS, 2024, 188
  • [27] Fuzzy adaptive Q-learning method with dynamic learning parameters
    Maeda, Y
    JOINT 9TH IFSA WORLD CONGRESS AND 20TH NAFIPS INTERNATIONAL CONFERENCE, PROCEEDINGS, VOLS. 1-5, 2001, : 2778 - 2780
  • [28] Story Creation Algorithm Using Q-Learning in a 2D Action RPG Video Game
    Fernandez-Samillan, Diego
    Guizado-Diaz, Carlos
    Ugarte, Willy
    PROCEEDINGS OF THE 28TH CONFERENCE OF OPEN INNOVATIONS ASSOCIATION FRUCT, 2021, : 111 - 117
  • [29] Q-learning for POMDP: An application to learning locomotion gaits
    Wang, Tixian
    Taghvaei, Amirhossein
    Mehta, Prashant G.
    2019 IEEE 58TH CONFERENCE ON DECISION AND CONTROL (CDC), 2019, : 2758 - 2763
  • [30] Application of fuzzy Q-learning for electricity market modeling by considering renewable power penetration
    Salehizadeh, Mohammad Reza
    Soltaniyan, Salman
    RENEWABLE & SUSTAINABLE ENERGY REVIEWS, 2016, 56 : 1172 - 1181