A Modified Memory-Based Reinforcement Learning Method for Solving POMDP Problems

Cited by: 0

Authors:

Lei Zheng

Siu-Yeung Cho

Affiliation:

[1] Nanyang Technological University, School of Computer Engineering

Source:

Neural Processing Letters | 2011 / Vol. 33

Keywords:

Memory-based reinforcement learning; Markov decision processes; Partially observable Markov decision processes; Reinforcement learning

DOI:

Not available


Abstract:

Partially observable Markov decision processes (POMDPs) provide a mathematical framework for agent planning in stochastic and partially observable environments. The classic Bayesian optimal solution can be obtained by transforming the problem into a Markov decision process (MDP) over belief states. However, because the belief state space is continuous and multi-dimensional, the problem is highly intractable. Many practical heuristic-based methods have been proposed, but most of them require a complete POMDP model of the environment, which is not always practical. This article introduces a modified memory-based reinforcement learning algorithm, called modified U-Tree, that is capable of learning from raw sensor experiences with minimal prior knowledge. It describes an enhancement of the original U-Tree’s state generation process that makes the generated model more compact, and proposes a modification of the statistical test for reward estimation that allows the algorithm to be benchmarked against traditional model-based algorithms on a set of well-known POMDP problems.
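
The belief-state transformation mentioned in the abstract can be illustrated with a minimal sketch of a Bayes-filter belief update, b'(s') ∝ O(o | s', a) · Σ_s T(s' | s, a) · b(s). This is not the modified U-Tree algorithm, which learns without such a model; it only shows the model-based step that the abstract contrasts it with. The function name `belief_update`, the table layout of `T` and `O`, and the Tiger-style numbers (a listening accuracy of 0.85) are assumptions chosen for illustration and are not taken from the article.

```python
def belief_update(belief, action, observation, T, O):
    """Return the posterior belief after taking `action` and seeing `observation`.

    belief: dict state -> probability
    T[s][a][s_next]: transition probability (assumed table layout)
    O[a][s_next][o]: observation probability (assumed table layout)
    """
    states = list(belief)
    unnormalized = {}
    for s_next in states:
        # Prediction step: probability of reaching s_next under the action.
        predicted = sum(T[s][action][s_next] * belief[s] for s in states)
        # Correction step: weight by the likelihood of the observation.
        unnormalized[s_next] = O[action][s_next][observation] * predicted
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return {s: p / total for s, p in unnormalized.items()}


# Hypothetical two-state model in the style of the Tiger benchmark:
# listening leaves the state unchanged and reports the correct side
# with probability 0.85 (illustrative value).
STATES = ("tiger_left", "tiger_right")
T = {s: {"listen": {s2: 1.0 if s == s2 else 0.0 for s2 in STATES}} for s in STATES}
O = {
    "listen": {
        "tiger_left": {"hear_left": 0.85, "hear_right": 0.15},
        "tiger_right": {"hear_left": 0.15, "hear_right": 0.85},
    }
}

b0 = {"tiger_left": 0.5, "tiger_right": 0.5}
b1 = belief_update(b0, "listen", "hear_left", T, O)
b2 = belief_update(b1, "listen", "hear_left", T, O)
```

Each belief is a point in a continuous probability simplex, which is why planning over beliefs becomes intractable as the number of states grows, and why a method that learns directly from experience, without the tables `T` and `O`, is attractive.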


Pages: 187-200

Page count: 13

