A Modified Memory-Based Reinforcement Learning Method for Solving POMDP Problems

Cited by: 0
Authors
Lei Zheng
Siu-Yeung Cho
Affiliations
[1] Nanyang Technological University,School of Computer Engineering
Source
Neural Processing Letters | 2011, Vol. 33
Keywords
Memory-based reinforcement learning; Markov decision processes; Partially observable Markov decision processes; Reinforcement learning;
DOI: not available
Abstract
Partially observable Markov decision processes (POMDPs) provide a mathematical framework for agent planning in stochastic, partially observable environments. The classic Bayesian optimal solution can be obtained by transforming the problem into a Markov decision process (MDP) over belief states. However, because the belief-state space is continuous and multi-dimensional, the problem is highly intractable. Many practical heuristic-based methods have been proposed, but most of them require a complete POMDP model of the environment, which is not always available. This article introduces a modified memory-based reinforcement learning algorithm, called modified U-Tree, that is capable of learning from raw sensor experiences with minimal prior knowledge. The article describes an enhancement of the original U-Tree's state-generation process that makes the generated model more compact, and also proposes a modification of the statistical test for reward estimation, which allows the algorithm to be benchmarked against traditional model-based algorithms on a set of well-known POMDP problems.
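The belief-state transformation mentioned in the abstract can be sketched as a Bayesian filter: the next belief is the observation-weighted, renormalized prediction of the transition model. The toy two-state, two-observation model below is hypothetical (not from the paper), and the action dependence of the transition and observation matrices is dropped for brevity:

```python
import numpy as np

# Hypothetical toy POMDP: 2 hidden states, 2 observations, one action.
T = np.array([[0.9, 0.1],    # T[s, s']: transition probabilities
              [0.2, 0.8]])
O = np.array([[0.7, 0.3],    # O[s', o]: observation probabilities in s'
              [0.4, 0.6]])

def belief_update(b, obs):
    """Bayes filter: b'(s') proportional to O(s', o) * sum_s T(s, s') * b(s)."""
    predicted = b @ T              # predicted next-state distribution
    unnorm = predicted * O[:, obs]  # weight by likelihood of the observation
    return unnorm / unnorm.sum()    # renormalize to a probability vector

b = np.array([0.5, 0.5])           # uniform initial belief
b = belief_update(b, obs=0)        # belief after observing o = 0
```

Iterating this update along an action-observation history yields the continuous belief-state MDP whose intractability motivates model-free alternatives such as U-Tree.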
Pages: 187-200 (13 pages)
Related Papers
50 records in total
  • [21] Memory-based crowd-aware robot navigation using deep reinforcement learning
    Samsani, Sunil Srivatsav
    Mutahira, Husna
    Muhammad, Mannan Saeed
    COMPLEX & INTELLIGENT SYSTEMS, 2023, 9 (02) : 2147 - 2158
  • [22] A tabular approach memory-based learning
    Lin, C.-S. (linc@missouri.edu), 1600, Taylor and Francis Inc. (05):
  • [23] Memory-based learning for visual odometry
    Roberts, Richard
    Nguyen, Hai
    Krishnamurthi, Niyant
    Balch, Tucker
    2008 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, VOLS 1-9, 2008, : 47 - 52
  • [24] BAYESIAN REINFORCEMENT LEARNING FOR POMDP-BASED DIALOGUE SYSTEMS
    Png, ShaoWei
    Pineau, Joelle
    2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 2156 - 2159
  • [25] Multiagent Reinforcement Learning: Rollout and Policy Iteration for POMDP With Application to Multirobot Problems
    Bhattacharya, Sushmita
    Kailas, Siva
    Badyal, Sahil
    Gil, Stephanie
    Bertsekas, Dimitri
    IEEE TRANSACTIONS ON ROBOTICS, 2024, 40 : 2003 - 2023
  • [26] Estimate of current state based on experience in POMDP for Reinforcement Learning
    Miyazaki, Yoshiki
    Kurashige, Kentarou
    PROCEEDINGS OF THE SEVENTEENTH INTERNATIONAL SYMPOSIUM ON ARTIFICIAL LIFE AND ROBOTICS (AROB 17TH '12), 2012, : 1135 - 1138
  • [27] Memory-based problem solving and schema induction in Go
    Heneveld, A
    Bundy, A
    Ramscar, M
    Richardson, J
    PROCEEDINGS OF THE TWENTY-SECOND ANNUAL CONFERENCE OF THE COGNITIVE SCIENCE SOCIETY, 2000, : 226 - 231
  • [28] Implicit memory-based technique in solving dynamic scheduling problems through Response Surface Methodology - Part I Model and method
    Abello, Manuel Blanco
    Michalewicz, Zbigniew
    INTERNATIONAL JOURNAL OF INTELLIGENT COMPUTING AND CYBERNETICS, 2014, 7 (02) : 111 - 139
  • [29] Reinforcement Learning for Solving Communication Problems in Shepherding
    Mohamed, Reem E.
    Elsayed, Saber
    Hunjet, Robert
    Abbass, Hussein
    2022 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (SSCI), 2022, : 1626 - 1635
  • [30] Solving Safety Problems with Ensemble Reinforcement Learning
    Ferreira, Leonardo A.
    dos Santos, Thiago F.
    Bianchi, Reinaldo A. C.
    Santos, Paulo E.
    AI 2019: ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, 11919 : 203 - 214