A Modified Memory-Based Reinforcement Learning Method for Solving POMDP Problems

Authors
Lei Zheng
Siu-Yeung Cho
Affiliation
[1] Nanyang Technological University, School of Computer Engineering
Source
Neural Processing Letters | 2011 / Vol. 33
Keywords
Memory-based reinforcement learning; Markov decision processes; Partially observable Markov decision processes; Reinforcement learning
DOI
Not available
Abstract
Partially observable Markov decision processes (POMDPs) provide a mathematical framework for agent planning in stochastic and partially observable environments. The classic Bayesian optimal solution can be obtained by transforming the problem into a Markov decision process (MDP) over belief states. However, because the belief state space is continuous and multi-dimensional, the problem is highly intractable. Many practical heuristic-based methods have been proposed, but most of them require a complete POMDP model of the environment, which is not always available in practice. This article introduces a modified memory-based reinforcement learning algorithm, called modified U-Tree, that is capable of learning from raw sensor experiences with minimal prior knowledge. It describes an enhancement of the original U-Tree's state generation process that makes the generated model more compact, and it proposes a modification of the statistical test for reward estimation, which allows the algorithm to be benchmarked against some traditional model-based algorithms on a set of well-known POMDP problems.
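The belief-state transformation mentioned in the abstract can be illustrated with the standard Bayesian belief update. The sketch below is not taken from the article: the function name `belief_update`, the nested-list layout of the transition model `T[a][s][s']` and observation model `O[a][s'][o]`, and the two-state numbers are all assumptions chosen for illustration. It also shows why model-based methods need a complete POMDP model, since both `T` and `O` must be known.

```python
def belief_update(belief, action, observation, T, O):
    """Standard Bayes filter step: b'(s') is proportional to
    O[a][s'][o] * sum_s T[a][s][s'] * b(s).

    T and O are hypothetical nested lists, indexed as
    T[action][state][next_state] and O[action][next_state][observation].
    """
    n = len(belief)
    unnormalized = [
        O[action][s_next][observation]
        * sum(T[action][s][s_next] * belief[s] for s in range(n))
        for s_next in range(n)
    ]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return [p / total for p in unnormalized]


# Illustrative two-state model (assumed numbers, not from the article):
# one action that leaves the state unchanged and yields a noisy reading
# that matches the true state with probability 0.85.
T = [[[1.0, 0.0], [0.0, 1.0]]]
O = [[[0.85, 0.15], [0.15, 0.85]]]

new_belief = belief_update([0.5, 0.5], action=0, observation=0, T=T, O=O)
print(new_belief)  # approximately [0.85, 0.15]
```

Each belief is a point in a continuous probability simplex, which is why planning directly over beliefs becomes intractable as the number of states grows.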
Pages: 187 - 200
Page count: 13
Related papers
50 in total
  • [41] Memory-based multi-population genetic learning for dynamic shortest path problems
    Diao, Yiya
    Li, Changhe
    Zeng, Sanyou
    Mavrovouniotis, Michalis
    Yang, Shengxiang
    2019 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC), 2019, : 2276 - 2283
  • [42] Memory-based learning of morphology with stochastic transducers
    Clark, A
    40TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, PROCEEDINGS OF THE CONFERENCE, 2002, : 513 - 520
  • [43] Memory-Based Learning: Using similarity for smoothing
    Zavrel, J
    Daelemans, W
    35TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 8TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, PROCEEDINGS OF THE CONFERENCE, 1997, : 436 - 443
  • [44] Developmental learning of memory-based perceptual models
    Ivanov, YA
    Blumberg, BM
    2ND INTERNATIONAL CONFERENCE ON DEVELOPMENT AND LEARNING, PROCEEDINGS, 2002, : 165 - 171
  • [45] ROBOT JUGGLING - IMPLEMENTATION OF MEMORY-BASED LEARNING
    SCHAAL, S
    ATKESON, CG
    IEEE CONTROL SYSTEMS MAGAZINE, 1994, 14 (01): : 57 - 71
  • [46] Method for solving constrained 0-1 quadratic programming problems based on pointer network and reinforcement learning
    Gu, Shenshen
    Zhuang, Yuxi
    NEURAL COMPUTING & APPLICATIONS, 2023, 35 (14): : 9973 - 9993
  • [47] Method for solving constrained 0-1 quadratic programming problems based on pointer network and reinforcement learning
    Shenshen Gu
    Yuxi Zhuang
    Neural Computing and Applications, 2023, 35 : 9973 - 9993
  • [48] Reinforcement Learning-Based Differential Evolution for Solving Economic Dispatch Problems
    Visutarrom, Thammarsat
    Chiang, Tsung-Che
    Konak, Abdullah
    Kulturel-Konak, Sadan
    2020 IEEE INTERNATIONAL CONFERENCE ON INDUSTRIAL ENGINEERING AND ENGINEERING MANAGEMENT (IEEE IEEM), 2020, : 913 - 917
  • [49] A Deep Reinforcement Learning-Based Scheme for Solving Multiple Knapsack Problems
    Sur, Giwon
    Ryu, Shun Yuel
    Kim, JongWon
    Lim, Hyuk
    APPLIED SCIENCES-BASEL, 2022, 12 (06):