Upper Confident Bound Fuzzy Q-learning and Its Application to a Video Game

Times cited: 0
Authors
Morita, Takahiro [1]
Hosobe, Hiroshi [2]
Affiliations
[1] Hosei Univ, Grad Sch Comp & Informat Sci, Tokyo, Japan
[2] Hosei Univ, Fac Comp & Informat Sci, Tokyo, Japan
Keywords
Machine Learning; Fuzzy Q-learning; UCB Algorithm; Video Game
DOI
10.5220/0010835700003116
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper proposes upper confident bound (UCB) fuzzy Q-learning, which combines fuzzy Q-learning with the UCBQ algorithm, and applies it to a video game. The UCBQ algorithm adapts the UCB action selection method to Q-learning. The UCB algorithm selects the action with the highest UCB value rather than the highest value estimate. Because the UCB algorithm presumes that every action is eventually selected so that its value estimate is obtained, the number of unselected actions becomes small, which helps avoid locally optimal solutions. The proposed method aims to make learning more efficient by reducing unselected actions and preventing the Q-values from settling on a locally optimal solution in fuzzy Q-learning. This paper applies the proposed method to a video game called Ms. PacMan and presents the result of an experiment on finding optimum values in the method. The method is evaluated by comparing its game scores with the scores obtained by a previous fuzzy Q-learning method. The result shows that the proposed method significantly reduced unselected actions.
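The selection rule described in the abstract can be illustrated with a minimal sketch. It assumes the textbook UCB1 bonus, `c * sqrt(ln(N) / n)`, added to each value estimate; the abstract does not give the paper's exact formula, and the function name `ucb_select`, the parameter `c`, and the list-based inputs are hypothetical names chosen for illustration. It is not the authors' fuzzy Q-learning implementation.

```python
import math


def ucb_select(q_values, counts, c=1.0):
    """Return the index of the action with the highest UCB value.

    q_values: current value estimate for each action (assumed input layout)
    counts:   how many times each action has been selected so far
    c:        exploration weight (assumed; the abstract does not name one)
    """
    # An action that has never been selected is tried first, so every
    # action eventually gets a value estimate.
    for action, n in enumerate(counts):
        if n == 0:
            return action

    total = sum(counts)
    # Assumed UCB1 form: value estimate plus an exploration bonus that
    # shrinks as an action is selected more often.
    scores = [
        q + c * math.sqrt(math.log(total) / n)
        for q, n in zip(q_values, counts)
    ]
    return max(range(len(scores)), key=scores.__getitem__)
```

With equal selection counts the bonus is the same for every action, so the rule reduces to picking the highest value estimate; a rarely selected action receives a larger bonus and can be chosen even when its estimate is lower.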
Pages: 454-461
Page count: 8
Related papers
50 items in total
  • [41] Design of a fuzzy logic controller with Evolutionary Q-Learning
    Kim, Min-Soeng
    Lee, Ju-Jang
    INTELLIGENT AUTOMATION AND SOFT COMPUTING, 2006, 12 (04): 369-381
  • [42] Dynamic fuzzy Q-Learning and control of mobile robots
    Deng, C
    Er, MJ
    Xu, J
    2004 8TH INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION, ROBOTICS AND VISION, VOLS 1-3, 2004: 2336-2341
  • [43] Fuzzy Rule Interpolation-based Q-learning
    Vincze, David
    Kovacs, Szilveszter
    SACI: 2009 5TH INTERNATIONAL SYMPOSIUM ON APPLIED COMPUTATIONAL INTELLIGENCE AND INFORMATICS, 2009: 45-49
  • [44] Initialization of Q-values by fuzzy rules for accelerating Q-learning
    Oh, CH
    Nakashima, T
    Ishibuchi, H
    IEEE WORLD CONGRESS ON COMPUTATIONAL INTELLIGENCE, 1998: 2051-2056
  • [45] Dynamic scheduling with fuzzy clustering based Q-learning
    Wang, Guo-Lei
    Lin, Lin
    Zhong, Shi-Sheng
    Jisuanji Jicheng Zhizao Xitong/Computer Integrated Manufacturing Systems, CIMS, 2009, 15 (04): 751-757
  • [46] Dynamic Fuzzy Q-Learning with Facility of Tuning and Removing Fuzzy Rules
    Hosoya, Yu
    Umano, Motohide
    2012 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE), 2012,
  • [47] Fuzzy Electricity Management System with Anomaly Detection and Fuzzy Q-Learning
    Syu, Jia-Hao
    Lin, Jerry Chun-Wei
    Srivastava, Gautam
    2023 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS, FUZZ, 2023,
  • [48] Implementation of Fuzzy Q-Learning Based on Modular Fuzzy Model and Parallel Structured Learning
    Watanabe, Toshihiko
    2009 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS (SMC 2009), VOLS 1-9, 2009: 1338-1344
  • [49] A Deep Q-Learning based approach applied to the Snake game
    Sebastianelli, Alessandro
    Tipaldi, Massimo
    Ullo, Silvia Liberata
    Glielmo, Luigi
    2021 29TH MEDITERRANEAN CONFERENCE ON CONTROL AND AUTOMATION (MED), 2021: 348-353
  • [50] Assessing the Potential of Classical Q-learning in General Game Playing
    Wang, Hui
    Emmerich, Michael
    Plaat, Aske
    ARTIFICIAL INTELLIGENCE, BNAIC 2018, 2019, 1021: 138-150