Reinforcement learning for continuous action using stochastic gradient ascent

Cited by: 0
Authors
Kimura, H [1]
Kobayashi, S [1]
Affiliations
[1] Tokyo Inst Technol, Midori Ku, Yokohama, Kanagawa 2268502, Japan
Source
INTELLIGENT AUTONOMOUS SYSTEMS: IAS-5 | 1998
Keywords
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
This paper considers reinforcement learning (RL) problems in which the set of possible actions is continuous and the reward is considerably delayed. The proposed method is based on stochastic gradient ascent in the policy parameter space: it does not require a model of the environment to be given or learned, it does not need to approximate the value function explicitly, and it is incremental, requiring only a constant amount of computation per step. We demonstrate its behavior on a simple linear regulator problem and a cart-pole control problem.
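The abstract gives no algorithmic details, so the following is only a minimal sketch of the general idea it names: incremental stochastic gradient ascent on the parameter of a continuous-action (Gaussian) policy, with no environment model and no value function. It is not the authors' algorithm. The toy 1-D regulator dynamics, the reward, the eligibility trace with decay `beta`, the running-average baseline, the clipping bounds, and every name and constant below are assumptions made for illustration.

```python
import random


def gaussian_score(action, state, weight, sigma):
    """Derivative w.r.t. weight of log N(action; weight * state, sigma^2)."""
    return (action - weight * state) * state / (sigma ** 2)


def train(steps=20000, alpha=1e-3, beta=0.9, sigma=0.5, seed=0):
    """Run one incremental policy-gradient pass on a toy 1-D regulator.

    All dynamics and hyperparameters are hypothetical, not from the paper.
    """
    rng = random.Random(seed)
    weight = 0.0      # policy parameter: action mean is weight * state
    trace = 0.0       # decaying sum of past score values (handles delay)
    baseline = 0.0    # running average reward, used to reduce variance
    state = rng.uniform(-1.0, 1.0)
    for _ in range(steps):
        # Sample a continuous action from the Gaussian policy.
        action = rng.gauss(weight * state, sigma)
        # Assumed quadratic cost on state and action, expressed as reward.
        reward = -(state ** 2 + action ** 2)
        # Constant work per step: update trace, parameter, and baseline.
        trace = beta * trace + gaussian_score(action, state, weight, sigma)
        weight += alpha * (reward - baseline) * trace
        weight = max(-2.0, min(2.0, weight))  # assumed safety clip
        baseline += 0.01 * (reward - baseline)
        # Assumed noisy linear dynamics with a bounded state.
        state = max(-4.0, min(4.0, state + action + rng.gauss(0.0, 0.1)))
    return weight


if __name__ == "__main__":
    print("learned feedback gain:", train())
```

The learner only sees sampled states, actions, and rewards, and each step costs a fixed number of arithmetic operations, which mirrors the model-free and incremental properties listed in the abstract. Whether the gain settles near a stabilizing value depends on the assumed step size and noise levels.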
Pages: 288-295
Number of pages: 8
Related papers
50 in total
  • [1] SA-SGA: Simulated Annealing Optimization and Stochastic Gradient Ascent Reinforcement Learning for Feature Selection
    Zandvakili, Aboozar
    Mansouri, Najme
    Javidi, Mohammad Masoud
    ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING, 2024,
  • [2] Reinforcement Learning Using a Stochastic Gradient Method with Memory-Based Learning
    Yamada, Takafumi
    Yamaguchi, Satoshi
    ELECTRICAL ENGINEERING IN JAPAN, 2010, 173 (01) : 32 - 40
  • [3] Stochastic gradient ascent learning with spike timing dependent plasticity
    Joana Vieira
    Orlando Arévalo
    Klaus Pawelzik
    BMC Neuroscience, 12 (Suppl 1)
  • [4] Reinforcement learning with knowledge by using a stochastic gradient method on a Bayesian network
    Yamamura, M
    Onozuka, T
    IEEE WORLD CONGRESS ON COMPUTATIONAL INTELLIGENCE, 1998, : 2045 - 2050
  • [5] Reinforcement learning in continuous action spaces
    van Hasselt, Hado
    Wiering, Marco A.
    2007 IEEE INTERNATIONAL SYMPOSIUM ON APPROXIMATE DYNAMIC PROGRAMMING AND REINFORCEMENT LEARNING, 2007, : 272 - +
  • [6] Jointly Learning Environments and Control Policies with Projected Stochastic Gradient Ascent
    Bolland, Adrien
    Boukas, Ioannis
    Berger, Mathias
    Ernst, Damien
    JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2022, 73 : 117 - 171
  • [7] Jointly Learning Environments and Control Policies with Projected Stochastic Gradient Ascent
    Bolland, Adrien
    Boukas, Ioannis
    Berger, Mathias
    Ernst, Damien
    Journal of Artificial Intelligence Research, 2022, 73 : 117 - 171
  • [8] Reinforcement learning for continuous stochastic control problems
    Munos, R
    Bourgine, P
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 10, 1998, 10 : 1029 - 1035
  • [9] Randomized Stochastic Gradient Descent Ascent
    Sebbouh, Othmane
    Cuturi, Marco
    Peyre, Gabriel
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 151, 2022, 151
  • [10] Algorithmic trading using continuous action space deep reinforcement learning
    Majidi, Naseh
    Shamsi, Mahdi
    Marvasti, Farokh
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 235