Reinforcement Learning in Latent Action Sequence Space

Cited by: 3
Authors
Kim, Heecheol [1 ]
Yamada, Masanori [2 ]
Miyoshi, Kosuke [3 ]
Iwata, Tomoharu [4 ]
Yamakawa, Hiroshi [5 ]
Affiliations
[1] Univ Tokyo, Grad Sch Informat Sci & Technol, Lab Intelligent Syst & Informat, Tokyo, Japan
[2] Nippon Telegraph & Tel Corp, Secure Platform Labs, Tokyo, Japan
[3] Narrat Nights Inc, Yokohama, Kanagawa, Japan
[4] Nippon Telegraph & Tel Corp, Commun Sci Labs, Tokyo, Japan
[5] Dwango Artificial Intelligence Lab, Tokyo, Japan
Keywords
Reinforcement Learning; Transfer Learning; Learning from Demonstration;
DOI
10.1109/IROS45743.2020.9341629
CLC number
TP [Automation Technology, Computer Technology]
Subject classification code
0812
Abstract
One problem in real-world applications of reinforcement learning is the high dimensionality of the action search space, which arises from combining actions over time. To reduce this dimensionality, macro actions, i.e., sequences of primitive actions for solving a task, have been studied. However, previous studies either relied on humans to define macro actions or assumed that a macro action was a repetition of the same primitive action. We propose encoded action sequence reinforcement learning (EASRL), a reinforcement learning method that learns flexible sequences of actions in a latent space for a high-dimensional action sequence search space. With EASRL, encoder and decoder networks are trained on demonstration data using a variational autoencoder that maps macro actions into the latent space. We then learn a policy network in the latent space, i.e., a distribution over encoded macro actions given a state. By learning in the latent space, we reduce the dimensionality of the action sequence search space and can handle varied patterns of action sequences. We experimentally demonstrate that the proposed method outperforms other reinforcement learning methods on tasks that require an extensive amount of search.
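The pipeline described in the abstract, a decoder that expands a low-dimensional latent code into a macro action, with the policy acting in that latent space, can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the linear `encode`/`decode` maps stand in for the trained VAE networks, and the Gaussian `policy` and all dimension constants are hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions, not from the paper).
SEQ_LEN, ACT_DIM, LATENT_DIM, STATE_DIM = 4, 2, 3, 5

# Hypothetical linear maps standing in for the trained VAE encoder/decoder.
W_enc = rng.normal(size=(SEQ_LEN * ACT_DIM, LATENT_DIM))
W_dec = rng.normal(size=(LATENT_DIM, SEQ_LEN * ACT_DIM))

def encode(action_seq):
    """Map a macro action (a sequence of primitive actions) to a latent code."""
    return action_seq.reshape(-1) @ W_enc

def decode(z):
    """Map a latent code back to a full macro action."""
    return (z @ W_dec).reshape(SEQ_LEN, ACT_DIM)

# Hypothetical Gaussian policy over the latent space, conditioned on state.
W_pi = rng.normal(size=(STATE_DIM, LATENT_DIM))

def policy(state):
    mean = state @ W_pi
    return mean + 0.1 * rng.normal(size=LATENT_DIM)  # sample a latent action

state = rng.normal(size=STATE_DIM)
z = policy(state)             # the policy searches only the low-dim latent space
macro_action = decode(z)      # the decoder expands it to SEQ_LEN primitive actions
print(macro_action.shape)     # (4, 2)
```

The point of the construction is visible in the shapes: the policy outputs a 3-dimensional latent vector instead of searching the 8-dimensional space of raw action sequences, which is what reduces the action sequence search space.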
Pages: 5497-5503
Page count: 7