Reinforcement learning for continuous action using stochastic gradient ascent

Cited by: 0

Authors:

Kimura, H [1]

Kobayashi, S [1]

Affiliation:

[1] Tokyo Inst Technol, Midori Ku, Yokohama, Kanagawa 2268502, Japan

Source:

INTELLIGENT AUTONOMOUS SYSTEMS: IAS-5 | 1998

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

This paper considers reinforcement learning (RL) where the set of possible actions is continuous and the reward is considerably delayed. The proposed method is based on stochastic gradient ascent with respect to the policy parameter space. It does not require a model of the environment to be given or learned, it does not need to approximate the value function explicitly, and it is incremental, requiring only a constant amount of computation per step. We demonstrate its behavior on a simple linear regulator problem and a cart-pole control problem.
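To make the idea in the abstract concrete, the following is a minimal sketch of a generic model-free, incremental policy-gradient update for a continuous action. It is not taken from the paper: the Gaussian policy with a single gain parameter `w`, the fixed `sigma`, the discounted eligibility trace, the toy one-dimensional regulator in `run_demo`, and all function and parameter names are assumptions chosen for illustration.

```python
import random


def score(w, sigma, x, a):
    """Derivative w.r.t. w of ln N(a | mean=w*x, std=sigma) (assumed policy)."""
    return (a - w * x) / sigma**2 * x


def sga_step(w, trace, x, a, reward, baseline=0.0,
             alpha=0.01, gamma=0.9, sigma=1.0):
    """One incremental update; constant work per step, no model, no value function.

    The discounted trace accumulates past score terms so that a delayed
    reward can still credit earlier actions (an assumption of this sketch).
    """
    trace = gamma * trace + score(w, sigma, x, a)
    w = w + alpha * (reward - baseline) * trace
    return w, trace


def run_demo(steps=2000, seed=0, sigma=0.5):
    """Toy 1-D regulator (hypothetical): next state = x + a + noise, reward = -x^2."""
    rng = random.Random(seed)
    w, trace, x = 0.0, 0.0, 1.0
    for _ in range(steps):
        a = rng.gauss(w * x, sigma)  # sample a continuous action
        x_next = x + a + rng.gauss(0.0, 0.1)
        x_next = max(-4.0, min(4.0, x_next))  # keep the toy state bounded
        reward = -(x_next ** 2)
        w, trace = sga_step(w, trace, x, a, reward, sigma=sigma)
        x = x_next
    return w


if __name__ == "__main__":
    print("learned gain:", run_demo())
```

Each call to `sga_step` touches only the current sample and the running trace, which is what keeps the per-step cost constant. The step size, discount, and noise level here are untuned placeholders.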

Citation

Pages: 288-295

Page count: 8
