Unsupervised Modeling of Partially Observable Environments

Authors
Graziano, Vincent [1]
Koutnik, Jan [1]
Schmidhuber, Juergen [1]
Affiliations
[1] Univ Lugano, SUPSI, IDSIA, CH-6928 Manno, Switzerland
Keywords
Self-Organizing Maps; POMDPs; Reinforcement Learning;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We present an architecture based on self-organizing maps for learning a sensory layer in a learning system. The architecture, temporal network for transitions (TNT), enjoys the freedoms of unsupervised learning, works on-line, in non-episodic environments, is computationally light, and scales well. TNT generates a predictive model of its internal representation of the world, making planning methods available for both the exploitation and exploration of the environment. Experiments demonstrate that TNT learns nice representations of classical reinforcement learning mazes of varying size (up to 20 × 20) under conditions of high noise and stochastic actions.
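The abstract names two ingredients: a self-organizing map that turns observations into an internal representation, and a predictive model over that representation. The sketch below illustrates only that general idea with a generic one-dimensional SOM and a table of transition counts between winning nodes. It is not the TNT architecture; `TinySOM`, `count_transitions`, and all parameter values are hypothetical names and choices made for illustration.

```python
import math
import random


class TinySOM:
    """Generic 1-D self-organizing map trained online (an assumption, not TNT itself)."""

    def __init__(self, n_nodes, dim, lr=0.5, sigma=1.0, seed=0):
        rng = random.Random(seed)
        # One weight vector per map node, initialised uniformly in [0, 1).
        self.weights = [[rng.random() for _ in range(dim)] for _ in range(n_nodes)]
        self.lr = lr
        self.sigma = sigma

    def distance(self, j, x):
        # Squared Euclidean distance between node j and observation x.
        return sum((w - v) ** 2 for w, v in zip(self.weights[j], x))

    def winner(self, x):
        # Best-matching unit: the node whose weight vector is closest to x.
        dists = [self.distance(j, x) for j in range(len(self.weights))]
        return dists.index(min(dists))

    def update(self, x):
        # Move the winner and its map neighbours towards the observation.
        bmu = self.winner(x)
        for j, w in enumerate(self.weights):
            h = math.exp(-((j - bmu) ** 2) / (2 * self.sigma ** 2))
            for k in range(len(w)):
                w[k] += self.lr * h * (x[k] - w[k])
        return bmu


def count_transitions(som, observations):
    # Count how often winner i is followed by winner j while training online.
    counts = {}
    prev = None
    for x in observations:
        cur = som.update(x)
        if prev is not None:
            counts[(prev, cur)] = counts.get((prev, cur), 0) + 1
        prev = cur
    return counts
```

Normalising each row of the count table would give an empirical transition model over map nodes, which is the kind of structure a planner could use. How TNT actually builds and uses its model is described in the paper, not here.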
Pages: 503-515
Page count: 13