Enhancing Reinforcement Learning Performance in Delayed Reward System Using DQN and Heuristics

Cited by: 2
Authors
Kim, Keecheon [1]
Affiliation
[1] Konkuk Univ, Dept Comp Informat & Commun Engn, Seoul 05029, South Korea
Source
IEEE ACCESS | 2022 / Vol. 10
Keywords
Games; Reinforcement learning; Shape; Q-learning; Licenses; Decision making; Visualization; Machine learning; reinforcement learning; heuristics; delayed reward system; Tetris;
DOI
10.1109/ACCESS.2022.3174361
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812 ;
Abstract
This paper proposes and implements a way to apply reinforcement learning to a delayed reward system, a setting in which machine learning techniques such as Q-learning are known to be difficult to apply. Games such as Tetris are known to be delayed reward systems because they generate sparse rewards during learning. Tetris demands quick judgment and fast responses from the player, because blocks must be stacked quickly in an optimal location while accounting for the random shape and rotation of each appearing block. Also, because the variety of block types and orderings makes the number of cases very large, a human player who relies only on memorization is limited in performance. We therefore apply the reinforcement learning method implemented in this study to this delayed reward system. We find that a general, conventional reinforcement learning method is limited in how far it can improve performance. Hence, we apply a heuristic as a reward-weighting method to increase decision accuracy. As a result, we were able to obtain high scores in the game, although it is not yet possible to say that this heuristic (rule-based) approach has completely conquered the game. In several experiments, this hybrid reinforcement learning plays better than a human in terms of learning speed as well as high scores. This paper shows that general Q-learning is not suitable for delayed reward systems, and it introduces a hybrid learning approach that adds prioritized experience replay, along with related techniques and algorithms, to increase reinforcement learning performance.
Pages: 50641 - 50650
Page count: 10
Related papers
50 in total
  • [1] Enhancing Reinforcement Learning Performance in Delayed Reward System Using DQN and Heuristics
    Kim, Keecheon
    IEEE Access, 2022, 10 : 50641 - 50650
  • [2] IMMEDIATE REINFORCEMENT IN DELAYED REWARD LEARNING IN PIGEONS
    WINTER, J
    PERKINS, CC
    JOURNAL OF THE EXPERIMENTAL ANALYSIS OF BEHAVIOR, 1982, 38 (02) : 169 - 179
  • [3] THE ROLE OF SECONDARY REINFORCEMENT IN DELAYED REWARD LEARNING
    SPENCE, KW
    PSYCHOLOGICAL REVIEW, 1947, 54 (01) : 1 - 8
  • [4] CONDITIONED (SECONDARY) REINFORCEMENT AND DELAYED REWARD LEARNING
    PERKINS, CC
    BULLETIN OF THE PSYCHONOMIC SOCIETY, 1981, 18 (02) : 57 - 57
  • [5] THE RELATION OF SECONDARY REINFORCEMENT TO DELAYED REWARD IN VISUAL DISCRIMINATION LEARNING
    GRICE, GR
    JOURNAL OF EXPERIMENTAL PSYCHOLOGY, 1948, 38 (01) : 1 - 16
  • [6] Learning classifier system with average reward reinforcement learning
    Zang, Zhaoxiang
    Li, Dehua
    Wang, Junying
    Xia, Dan
    KNOWLEDGE-BASED SYSTEMS, 2013, 40 : 58 - 71
  • [7] Multi-Agent Deep Q Network to Enhance the Reinforcement Learning for Delayed Reward System
    Kim, Keecheon
    APPLIED SCIENCES-BASEL, 2022, 12 (07):
  • [8] Applying Double DQN to Reinforcement learning of Automated Designing ICT System
    Okamura, Natsuki
    Yakuwa, Yutaka
    Kuroda, Takayuki
    Yairi, Ikuko E.
    IEICE COMMUNICATIONS EXPRESS, 2022, 11 (10) : 667 - 672
  • [9] Applying Double DQN to Reinforcement learning of Automated Designing ICT System
    Okamura, Natsuki
    Yakuwa, Yutaka
    Kuroda, Takayuki
    Yairi, Ikuko E.
    IEICE COMMUNICATIONS EXPRESS, 2022,
  • [10] Enhancing Reinforcement Learning Finetuned Text-to-Image Generative Model Using Reward Ensemble
    Back, Kyungryul
    Piao, XinYu
    Kim, Jong-Kook
    GENERATIVE INTELLIGENCE AND INTELLIGENT TUTORING SYSTEMS, PT II, ITS 2024, 2024, 14799 : 213 - 224