Upper Confident Bound Fuzzy Q-learning and Its Application to a Video Game

Cited by: 0
Authors
Morita, Takahiro [1]
Hosobe, Hiroshi [2]
Affiliations
[1] Hosei Univ, Grad Sch Comp & Informat Sci, Tokyo, Japan
[2] Hosei Univ, Fac Comp & Informat Sci, Tokyo, Japan
Keywords
Machine Learning; Fuzzy Q-learning; UCB Algorithm; Video Game
DOI
10.5220/0010835700003116
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper proposes upper confident bound (UCB) fuzzy Q-learning by combining fuzzy Q-learning and the UCBQ algorithm and applies it to a video game. The UCBQ algorithm improved the action selection method called the UCB algorithm by applying it to Q-learning. The UCB algorithm selects the action with the highest UCB value instead of a value estimate. Since the UCB algorithm is based on the premise that any unselected actions are selected and value estimates are obtained, the number of unselected actions becomes small, and it is able to prevent local optimal solutions. The proposed method aims to promote the efficiency of learning by reducing unselected actions and preventing the Q value from becoming a local optimal solution in fuzzy Q-learning. This paper applies the proposed method to a video game called Ms. PacMan and presents the result of an experiment on finding optimum values in the method. Its evaluation is conducted by comparing the game scores with the scores obtained by a previous fuzzy Q-learning method. The result shows that the proposed method significantly reduced unselected actions.
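The action selection rule described in the abstract can be illustrated with a minimal sketch. It assumes the standard bandit form of the UCB score, Q(a) + c * sqrt(ln t / N(a)), which may differ from the exact UCBQ formulation used in the paper. The function name `ucb_select` and the exploration constant `c` are hypothetical and chosen only for illustration.

```python
import math


def ucb_select(q_values, counts, c=2.0):
    """Return the index of the action with the highest UCB score.

    q_values: current value estimate for each action.
    counts:   number of times each action has been selected so far.
    c:        exploration constant (an assumed parameter, not from the paper).
    """
    # An action that has never been selected is tried first, so every
    # action obtains a value estimate before scores are compared.
    for action, n in enumerate(counts):
        if n == 0:
            return action

    total = sum(counts)
    # Score = value estimate plus a bonus that shrinks as the action
    # is selected more often.
    scores = [
        q + c * math.sqrt(math.log(total) / n)
        for q, n in zip(q_values, counts)
    ]
    return max(range(len(scores)), key=scores.__getitem__)
```

With `c=0.0` the rule reduces to greedy selection on the value estimates. A larger `c` favors rarely selected actions, which is the mechanism the abstract credits with reducing unselected actions.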
Pages: 454-461
Page count: 8