Enhancing Reinforcement Learning Performance in Delayed Reward System Using DQN and Heuristics

Cited by: 2
Authors
Kim, Keecheon [1]
Affiliation
[1] Konkuk Univ, Dept Comp Informat & Commun Engn, Seoul 05029, South Korea
Source
IEEE ACCESS | 2022 / Vol. 10
Keywords
Games; Reinforcement learning; Shape; Q-learning; Licenses; Decision making; Visualization; Machine learning; reinforcement learning; heuristics; delayed reward system; Tetris
DOI
10.1109/ACCESS.2022.3174361
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper proposes and implements a way to apply reinforcement learning to delayed reward systems, which are known to be difficult targets for machine learning techniques such as Q-learning. Games such as Tetris are delayed reward systems because rewards are sparse during learning. Tetris demands quick judgment and fast responses from the player, because blocks must be stacked quickly in an optimal location while accounting for the random shape and rotation of each appearing block. Also, because the number of cases is very large due to the variety of block types and orderings, a human player who relies on memorization alone is limited in performance. We therefore apply reinforcement learning to this delayed reward system. We find that a conventional reinforcement learning method is limited in how far it can improve performance. Hence, we apply heuristics as a reward weighting method to increase decision accuracy. As a result, we obtain high scores in games, although it is not yet possible to say that this heuristic (rule-based) approach has completely conquered the game. In several experiments, this hybrid reinforcement learning outperforms human players in learning speed as well as in score. The paper shows that general Q-learning is not suitable for delayed reward systems, and it introduces a hybrid learning approach that adds prioritized experience replay, together with related techniques and algorithms, to increase reinforcement learning performance.
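The abstract describes using heuristics as a weighting on the reward so that a sparse, delayed signal becomes more informative. The abstract does not say which heuristics or weights the paper uses, so the following is only a minimal sketch under stated assumptions: the board features (hole count, maximum column height), the weights, and all function names are hypothetical and chosen for illustration, not taken from the paper.

```python
from typing import List

# Assumed board encoding: rows listed top to bottom, 1 = filled cell, 0 = empty.
Board = List[List[int]]


def column_heights(board: Board) -> List[int]:
    """Height of each column, measured from the floor to its topmost filled cell."""
    rows, cols = len(board), len(board[0])
    heights = []
    for c in range(cols):
        h = 0
        for r in range(rows):
            if board[r][c]:
                h = rows - r
                break
        heights.append(h)
    return heights


def count_holes(board: Board) -> int:
    """Count empty cells that have a filled cell somewhere above them in the same column."""
    holes = 0
    for c in range(len(board[0])):
        seen_block = False
        for r in range(len(board)):
            if board[r][c]:
                seen_block = True
            elif seen_block:
                holes += 1
    return holes


def shaped_reward(board: Board, lines_cleared: int,
                  w_lines: float = 1.0, w_holes: float = 0.5,
                  w_height: float = 0.1) -> float:
    """Sparse line-clear reward minus heuristic penalties (illustrative weights only)."""
    heights = column_heights(board)
    return (w_lines * lines_cleared
            - w_holes * count_holes(board)
            - w_height * max(heights))
```

In a sketch like this, an agent would receive `shaped_reward(...)` after every placement instead of waiting for a line clear, which is one common way to densify a delayed reward; whether the paper combines its heuristics additively in this way is an assumption.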
Pages: 50641-50650
Page count: 10
Related papers
50 in total
  • [21] Application of Fuzzy Inference System to Average Reward Reinforcement Learning
    Chen, Wei
    Zhai, Zhenkun
    Li, Xiong
    Guo, Jing
    2009 INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY AND COMPUTER SCIENCE, VOL 1, PROCEEDINGS, 2009, : 374 - 377
  • [22] Framing reinforcement learning from human reward: Reward positivity, temporal discounting, episodicity, and performance
    Knox, W. Bradley
    Stone, Peter
    ARTIFICIAL INTELLIGENCE, 2015, 225 : 24 - 50
  • [23] Exploring reward efficacy in traffic management using deep reinforcement learning in intelligent transportation system
    Paul, Ananya
    Mitra, Sulata
    ETRI JOURNAL, 2022, 44 (02) : 194 - 207
  • [24] Using Semantic Similarity as Reward for Reinforcement Learning in Sentence Generation
    Yasui, Go
    Tsuruoka, Yoshimasa
    Nagata, Masaaki
    57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019): STUDENT RESEARCH WORKSHOP, 2019, : 400 - 406
  • [25] Reward Function Using Inverse Reinforcement Learning and Fuzzy Reasoning
    Kato, Yuta
    Kanoh, Masayoshi
    Nakamura, Tsuyoshi
    2020 JOINT 11TH INTERNATIONAL CONFERENCE ON SOFT COMPUTING AND INTELLIGENT SYSTEMS AND 21ST INTERNATIONAL SYMPOSIUM ON ADVANCED INTELLIGENT SYSTEMS (SCIS-ISIS), 2020, : 222 - 227
  • [26] Using Reward-Weighted Imitation for Robot Reinforcement Learning
    Peters, Jan
    Kober, Jens
    ADPRL: 2009 IEEE SYMPOSIUM ON ADAPTIVE DYNAMIC PROGRAMMING AND REINFORCEMENT LEARNING, 2009, : 226 - 232
  • [27] Reinforcement Learning using Reward Expectations in Scenarios with Aleatoric Uncertainties
    Wang, Yubin
    Sun, Yifeng
    Wu, Jiang
    Hu, Hao
    Wu, Zhiqiang
    Huang, Weigui
    2022 IEEE CONFERENCE ON GAMES, COG, 2022, : 261 - 267
  • [28] Hierarchical Reward Model of Deep Reinforcement Learning for Enhancing Cooperative Behavior in Automated Driving
    Matsuda, Kenji
    Suzuki, Tenta
    Harada, Tomohiro
    Matsuoka, Johei
    Tobisawa, Mao
    Hoshino, Jyunya
    Itoh, Yuuki
    Kumagae, Kaito
    Kagawa, Toshinori
    Hattori, Kiyohiko
    JOURNAL OF ADVANCED COMPUTATIONAL INTELLIGENCE AND INTELLIGENT INFORMATICS, 2024, 28 (02) : 431 - 443
  • [29] Reward criteria impact on the performance of reinforcement learning agent for autonomous navigation
    Dayal, Aveen
    Cenkeramaddi, Linga Reddy
    Jha, Ajit
    APPLIED SOFT COMPUTING, 2022, 126
  • [30] Optimizing Reinforcement Learning Agents in Games Using Curriculum Learning and Reward Shaping
    Khan, Adil
    Muhammad, Muhammad
    Naeem, Muhammad
    COMPUTER ANIMATION AND VIRTUAL WORLDS, 2025, 36 (01)