An Associative State-Space Metric for Learning in Factored MDPs

Cited by: 0
Authors:
Sequeira, Pedro [1 ,2 ]
Melo, Francisco S. [1 ,2 ]
Paiva, Ana [1 ,2 ]
Affiliations:
[1] Univ Tecn Lisboa, INESC ID, Av Prof Dr Cavaco Silva, P-2744016 Porto Salvo, Portugal
[2] Univ Tecn Lisboa, Inst Super Tecn, P-2744016 Porto Salvo, Portugal
Keywords: (none listed)
DOI: Not available
Chinese Library Classification: TP18 [Artificial Intelligence Theory]
Discipline codes: 081104; 0812; 0835; 1405
Abstract:
In this paper we propose a novel associative metric based on the classical conditioning paradigm that, much like what happens in nature, identifies associations between stimuli perceived by a learning agent while interacting with the environment. We use an associative tree structure to identify associations between the perceived stimuli and to measure the degree of similarity between states in factored Markov decision problems. Our approach provides a state-space metric that requires no prior knowledge of the structure of the underlying decision problem and is designed to be learned online, i.e., as the agent interacts with its environment. Our metric is thus amenable to application in reinforcement learning (RL) settings, allowing the learning agent to generalize its experience to unvisited states and improving overall learning performance. We illustrate the application of our method in several problems of varying complexity and show that our metric leads to performance comparable to that obtained with other well-studied metrics that require full knowledge of the decision problem.
Pages: 163-174 (12 pages)
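
To ground the approach described in the abstract, below is a minimal Python sketch of an online, association-based state distance. This is an illustration under simplifying assumptions, not the authors' algorithm: a pairwise co-occurrence table stands in for the paper's associative tree, association strength is computed as a Jaccard-style ratio, and all class and method names (AssociativeMetric, update, distance) are hypothetical.

from collections import defaultdict
from itertools import combinations


class AssociativeMetric:
    """Online, association-based distance between factored states.

    Hypothetical sketch: a flat pairwise co-occurrence table stands in
    for the associative tree of the paper; names are illustrative only.
    """

    def __init__(self):
        self.counts = defaultdict(int)       # observations of each stimulus
        self.pair_counts = defaultdict(int)  # joint observations of stimulus pairs

    def update(self, stimuli):
        """Record the set of stimuli perceived in the current state (one online step)."""
        for s in stimuli:
            self.counts[s] += 1
        for a, b in combinations(sorted(stimuli), 2):
            self.pair_counts[(a, b)] += 1

    def association(self, a, b):
        """Jaccard-style association strength between two stimuli, in [0, 1]."""
        if a == b:
            return 1.0
        joint = self.pair_counts[tuple(sorted((a, b)))]
        union = self.counts[a] + self.counts[b] - joint
        return joint / union if union else 0.0

    def distance(self, x, y):
        """Distance between two states, each given as a set of stimuli (factor values).

        Shared stimuli cost nothing; each unshared stimulus costs 1 minus
        its strongest association with the other state's stimuli.
        """
        cost = 0.0
        for s in x ^ y:  # symmetric difference: stimuli not shared by both states
            other = y if s in x else x
            best = max((self.association(s, t) for t in other), default=0.0)
            cost += 1.0 - best
        return cost


# Usage sketch: states as sets of perceived stimuli (factor values).
metric = AssociativeMetric()
metric.update({"light_on", "lever_left", "food"})
metric.update({"light_on", "lever_right", "food"})
metric.update({"light_off", "lever_left"})
print(metric.distance({"light_on", "lever_left"}, {"light_on", "lever_right"}))

In an RL loop, such a distance could let the agent share value estimates between nearby states, as the abstract suggests. Note that a similarity-derived distance of this kind is not guaranteed to satisfy the triangle inequality, and the flat co-occurrence table deliberately simplifies how the paper's tree structure organizes associations.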