Upper Confident Bound Fuzzy Q-learning and Its Application to a Video Game

Cited by: 0

Authors:

Morita, Takahiro ^{[1]}

Hosobe, Hiroshi ^{[2]}

Affiliations:

[1] Hosei Univ, Grad Sch Comp & Informat Sci, Tokyo, Japan

[2] Hosei Univ, Fac Comp & Informat Sci, Tokyo, Japan

Source:

ICAART: PROCEEDINGS OF THE 14TH INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE - VOL 3 | 2022

Keywords:

Machine Learning; Fuzzy Q-learning; UCB Algorithm; Video Game;

DOI:

10.5220/0010835700003116

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper proposes upper confident bound (UCB) fuzzy Q-learning by combining fuzzy Q-learning and the UCBQ algorithm and applies it to a video game. The UCBQ algorithm improved the action selection method called the UCB algorithm by applying it to Q-learning. The UCB algorithm selects the action with the highest UCB value instead of a value estimate. Since the UCB algorithm is based on the premise that any unselected actions are selected and value estimates are obtained, the number of unselected actions becomes small, and it is able to prevent local optimal solutions. The proposed method aims to promote the efficiency of learning by reducing unselected actions and preventing the Q value from becoming a local optimal solution in fuzzy Q-learning. This paper applies the proposed method to a video game called Ms. PacMan and presents the result of an experiment on finding optimum values in the method. Its evaluation is conducted by comparing the game scores with the scores obtained by a previous fuzzy Q-learning method. The result shows that the proposed method significantly reduced unselected actions.
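The action-selection idea summarized in the abstract can be illustrated with a minimal sketch. It shows the textbook UCB1 rule over a plain list of value estimates, not the paper's UCBQ or fuzzy Q-learning method. The function name `ucb_select`, its parameters, and the exploration coefficient `c` are assumptions made for illustration only.

```python
import math


def ucb_select(q_values, counts, total_steps, c=2.0):
    """Return the index of the action with the highest UCB value.

    q_values:    current value estimate for each action
    counts:      number of times each action has been selected
    total_steps: total number of selections made so far
    c:           exploration coefficient (hypothetical default)
    """
    # An action that was never selected is tried first, so that
    # every action eventually obtains a value estimate.
    for action, n in enumerate(counts):
        if n == 0:
            return action

    # UCB value = value estimate + exploration bonus that shrinks
    # as the action is selected more often.
    ucb = [
        q + c * math.sqrt(math.log(total_steps) / n)
        for q, n in zip(q_values, counts)
    ]
    return max(range(len(ucb)), key=ucb.__getitem__)
```

With equal selection counts the rule reduces to picking the highest value estimate; a rarely selected action receives a large bonus and can be chosen even when its estimate is lower.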

Pages: 454-461

Page count: 8

50 items in total

[41] Design of a fuzzy logic controller with Evolutionary Q-Learning
Kim, Min-Soeng
Lee, Ju-Jang
INTELLIGENT AUTOMATION AND SOFT COMPUTING, 2006, 12 (04): : 369 - 381
[42] Dynamic fuzzy Q-Learning and control of mobile robots
Deng, C
Er, MJ
Xu, J
2004 8TH INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION, ROBOTICS AND VISION, VOLS 1-3, 2004, : 2336 - 2341
[43] Fuzzy Rule Interpolation-based Q-learning
Vincze, David
Kovacs, Szilveszter
SACI: 2009 5TH INTERNATIONAL SYMPOSIUM ON APPLIED COMPUTATIONAL INTELLIGENCE AND INFORMATICS, 2009, : 45 - 49
[44] Initialization of Q-values by fuzzy rules for accelerating Q-learning
Oh, CH
Nakashima, T
Ishibuchi, H
IEEE WORLD CONGRESS ON COMPUTATIONAL INTELLIGENCE, 1998, : 2051 - 2056
[45] Dynamic scheduling with fuzzy clustering based Q-learning
Wang, Guo-Lei
Lin, Lin
Zhong, Shi-Sheng
Jisuanji Jicheng Zhizao Xitong/Computer Integrated Manufacturing Systems, CIMS, 2009, 15 (04): : 751 - 757
[46] Dynamic Fuzzy Q-Learning with Facility of Tuning and Removing Fuzzy Rules
Hosoya, Yu
Umano, Motohide
2012 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE), 2012,
[47] Fuzzy Electricity Management System with Anomaly Detection and Fuzzy Q-Learning
Syu, Jia-Hao
Lin, Jerry Chun-Wei
Srivastava, Gautam
2023 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS, FUZZ, 2023,
[48] Implementation of Fuzzy Q-Learning Based on Modular Fuzzy Model and Parallel Structured Learning
Watanabe, Toshihiko
2009 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS (SMC 2009), VOLS 1-9, 2009, : 1338 - 1344
[49] A Deep Q-Learning based approach applied to the Snake game
Sebastianelli, Alessandro
Tipaldi, Massimo
Ullo, Silvia Liberata
Glielmo, Luigi
2021 29TH MEDITERRANEAN CONFERENCE ON CONTROL AND AUTOMATION (MED), 2021, : 348 - 353
[50] Assessing the Potential of Classical Q-learning in General Game Playing
Wang, Hui
Emmerich, Michael
Plaat, Aske
ARTIFICIAL INTELLIGENCE, BNAIC 2018, 2019, 1021 : 138 - 150