DEEP REINFORCEMENT LEARNING WITH SPARSE DISTRIBUTED MEMORY FOR "WATER WORLD" PROBLEM SOLVING

Cited: 0
Authors
Novotarskyi, M. A. [1 ]
Stirenko, S. G. [1 ]
Gordienko, Y. G. [1 ]
Kuzmych, V. A. [1 ]
Affiliations
[1] Natl Tech Univ Ukraine, Igor Sikorsky Kyiv Polytech Inst, Dept Comp Engn, Kiev, Ukraine
Keywords
Deep Reinforcement Learning; DQN-algorithm; Sparse Distributed Memory; "Water World" problem
DOI
10.15588/1607-3274-2021-1-14
CLC number
TP3 [Computing technology, computer technology]
Discipline code
0812
Abstract
Context. Machine learning is one of the actively developing areas of data processing. Reinforcement learning is a class of machine learning methods in which the problem is to map sequences of environmental states to an agent's actions. Significant progress in this area has been achieved with DQN algorithms, one of the first classes of algorithms to learn stably with deep neural networks. The main disadvantage of this approach is the rapid growth of RAM consumption in real-world tasks. The approach proposed in this paper partially addresses this problem.

Objective. The aim is to develop a method for forming the structure of, and the access pattern to, a sparse distributed memory with increased information content, in order to improve reinforcement learning without additional memory.

Method. A method is proposed for forming the structure of, and modifying, a sparse distributed memory that stores the actor's previous transitions in the form of prototypes. The method increases the informativeness of the stored data and, as a result, improves the process of building a model of the studied process by intensifying the training of the deep neural network. The informativeness of the stored data is increased by the following sequence of actions. First, the new transition is compared with the last saved transition; for this comparison, the method introduces a distance estimate between transitions. If the distance between the new transition and the last saved transition is smaller than a specified threshold, the new transition is written in place of the previous one without increasing the amount of memory. Otherwise, a new prototype is created in memory, and the prototype that has been stored the longest is deleted.

Results. The proposed method was studied on the popular "Water World" test problem. The results showed a 1.5-fold increase in the actor's survival time in a hostile environment. This result was achieved by increasing the informativeness of the stored data without increasing the amount of RAM.

Conclusions. The proposed method of forming and modifying the structure of a sparse distributed memory increased the informativeness of the stored data. As a result, reinforcement learning performance on the "Water World" problem was improved by increasing the accuracy of the model of the physical process represented by the deep neural network.
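The Method paragraph describes the memory-update rule in enough detail to sketch it. Below is a minimal Python sketch of such a prototype buffer; the class name SparseDistributedMemory, the Euclidean distance metric, the FIFO eviction via a bounded deque, and the flattened (state, action, reward, next_state) encoding are all illustrative assumptions, since the abstract does not fix these details.

import numpy as np
from collections import deque

class SparseDistributedMemory:
    """Bounded transition memory storing prototypes (illustrative sketch).

    A new transition overwrites the most recent prototype when it lies
    within `distance_threshold` of it; otherwise it is appended and, at
    capacity, the oldest prototype is evicted (FIFO via the deque maxlen).
    """

    def __init__(self, capacity: int, distance_threshold: float):
        self.prototypes = deque(maxlen=capacity)  # appending at capacity drops the oldest
        self.threshold = distance_threshold

    @staticmethod
    def _distance(a: np.ndarray, b: np.ndarray) -> float:
        # Assumed metric: Euclidean distance between flattened
        # (state, action, reward, next_state) vectors.
        return float(np.linalg.norm(a - b))

    def store(self, transition) -> None:
        t = np.asarray(transition, dtype=np.float32).ravel()
        if self.prototypes and self._distance(t, self.prototypes[-1]) < self.threshold:
            self.prototypes[-1] = t    # near-duplicate: overwrite, no memory growth
        else:
            self.prototypes.append(t)  # new prototype; oldest evicted if buffer is full

    def sample(self, batch_size: int, rng=None) -> np.ndarray:
        # Uniform sampling of a minibatch of stored prototypes for DQN training.
        rng = rng or np.random.default_rng()
        idx = rng.choice(len(self.prototypes), size=batch_size, replace=False)
        return np.stack([self.prototypes[i] for i in idx])

# Hypothetical usage inside a DQN training loop:
#   mem = SparseDistributedMemory(capacity=50_000, distance_threshold=0.1)
#   mem.store(np.concatenate([s, [a, r], s_next]))
#   batch = mem.sample(32)

Overwriting near-duplicate transitions keeps the buffer size constant while biasing the stored prototypes toward well-separated, more informative experiences, which is the effect the abstract attributes to the method.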
Pages: 136-143
Page count: 8