Batch Reinforcement Learning on the Industrial Benchmark: First Experiences

Citations: 0
Authors
Hein, Daniel [1,2]
Udluft, Steffen [2]
Tokic, Michel [2]
Hentschel, Alexander [2]
Runkler, Thomas A. [1,2]
Sterzing, Volkmar [2]
Affiliations
[1] Tech Univ Munich, Dept Informat, Boltzmannstr 3, D-85748 Garching, Germany
[2] Siemens AG, Corp Technol, Otto Hahn Ring 6, D-81739 Munich, Germany
Keywords
(none listed)
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405
Abstract
The Particle Swarm Optimization Policy (PSO-P) was recently introduced and has been shown to produce remarkable results on academic reinforcement learning benchmarks in an off-policy, batch-based setting. To further investigate its properties and its feasibility for real-world applications, this paper evaluates PSO-P on the so-called Industrial Benchmark (IB), a novel reinforcement learning (RL) benchmark that aims to be realistic by including a variety of aspects found in industrial applications, such as continuous state and action spaces, a high-dimensional, partially observable state space, delayed effects, and complex stochasticity. The experimental results of PSO-P on the IB are compared to results of closed-form control policies derived from the model-based Recurrent Control Neural Network (RCNN) and the model-free Neural Fitted Q-Iteration (NFQ). The experiments show that PSO-P is of interest not only for academic benchmarks but also for real-world industrial applications, since it yielded the best-performing policy in our IB setting. Compared to other well-established RL techniques, PSO-P produced outstanding results in both performance and robustness while requiring relatively little effort to find adequate parameters or make complex design decisions.
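As a reading aid, the sketch below illustrates the core idea behind PSO-P as described in the abstract: at each time step, particle swarm optimization searches for the action sequence that maximizes the return predicted by a model learned from batch data, and only the first action of the best sequence is executed (receding horizon). This is a minimal sketch, not the authors' implementation; the `model(state, action) -> (next_state, reward)` interface, scalar actions, and all hyperparameters are illustrative assumptions.

```python
import numpy as np

def pso_policy_action(model, state, horizon=10, n_particles=32, n_iters=50,
                      a_low=-1.0, a_high=1.0, w=0.7, c1=1.4, c2=1.4, rng=None):
    """Select one action by optimizing a whole action sequence with PSO
    against a learned model, then executing only the first action.
    Assumes `model(s, a)` returns `(next_state, reward)` (illustrative)."""
    rng = np.random.default_rng(rng)
    dim = horizon  # one scalar action per step; vector actions would flatten

    # Particle positions are candidate action sequences; velocities start at zero.
    pos = rng.uniform(a_low, a_high, size=(n_particles, dim))
    vel = np.zeros_like(pos)

    def rollout_return(seq):
        # Evaluate a candidate action sequence by rolling out the learned model.
        s, total = state, 0.0
        for a in seq:
            s, r = model(s, a)
            total += r
        return total

    fitness = np.array([rollout_return(p) for p in pos])
    pbest, pbest_fit = pos.copy(), fitness.copy()          # personal bests
    gbest = pos[np.argmax(fitness)].copy()                 # global best
    gbest_fit = fitness.max()

    for _ in range(n_iters):
        # Standard global-best PSO velocity and position update.
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, a_low, a_high)

        fitness = np.array([rollout_return(p) for p in pos])
        improved = fitness > pbest_fit
        pbest[improved], pbest_fit[improved] = pos[improved], fitness[improved]
        if fitness.max() > gbest_fit:
            gbest, gbest_fit = pos[np.argmax(fitness)].copy(), fitness.max()

    return gbest[0]  # receding horizon: execute only the first action
```

Because the swarm only needs forward rollouts of the learned model, no closed-form policy has to be trained at all, which is what distinguishes PSO-P from the RCNN- and NFQ-derived policies it is compared against in the abstract.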
Pages: 4214-4221
Number of Pages: 8
Related Papers (50 in total)
  • [1] Benchmark study of reinforcement learning in controlling and optimizing batch processes
    Zhu, Wenbo
    Castillo, Ivan
    Wang, Zhenyu
    Rendall, Ricardo
    Chiang, Leo H.
    Hayot, Philippe
    Romagnoli, Jose A.
    [J]. Journal of Advanced Manufacturing and Processing, 2022, 4 (02)
  • [2] Reinforcement learning in batch processes
    Wilson, JA
    Martinez, EC
    [J]. APPLICATION OF NEURAL NETWORKS AND OTHER LEARNING TECHNOLOGIES IN PROCESS ENGINEERING, 2001, : 269 - 286
  • [3] COMPOSUITE: A COMPOSITIONAL REINFORCEMENT LEARNING BENCHMARK
    Mendez, Jorge A.
    Hussing, Marcel
    Gummadi, Meghna
    Eaton, Eric
    [J]. CONFERENCE ON LIFELONG LEARNING AGENTS, VOL 199, 2022, 199
  • [4] Reinforcement Learning for Batch-to-Batch Bioprocess Optimisation
    Petsagkourakis, P.
    Sandoval, I. Orson
    Bradford, E.
    Zhang, D.
    del Rio-Chanona, E. A.
    [J]. 29TH EUROPEAN SYMPOSIUM ON COMPUTER AIDED PROCESS ENGINEERING, PT A, 2019, 46 : 919 - 924
  • [5] Batch Reinforcement Learning from Crowds
    Zhang, Guoxi
    Kashima, Hisashi
    [J]. MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2022, PT IV, 2023, 13716 : 38 - 51
  • [6] Batch Reinforcement Learning with Hyperparameter Gradients
    Lee, Byung-Jun
    Lee, Jongmin
    Vrancx, Peter
    Kim, Dongho
    Kim, Kee-Eung
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119, 2020, 119
  • [7] Reinforcement learning for batch bioprocess optimization
    Petsagkourakis, P.
    Sandoval, I. O.
    Bradford, E.
    Zhang, D.
    del Rio-Chanona, E. A.
    [J]. COMPUTERS & CHEMICAL ENGINEERING, 2020, 133
  • [8] Small batch deep reinforcement learning
    Obando-Ceron, Johan
    Bellemare, Marc G.
    Castro, Pablo Samuel
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [9] Batch Prioritization in Multigoal Reinforcement Learning
    Vecchietti, Luiz Felipe
    Kim, Taeyoung
    Choi, Kyujin
    Hong, Junhee
    Har, Dongsoo
    [J]. IEEE ACCESS, 2020, 8 : 137449 - 137461
  • [10] Batch reinforcement learning with state importance
    Li, LH
    Bulitko, V
    Greiner, R
    [J]. MACHINE LEARNING: ECML 2004, PROCEEDINGS, 2004, 3201 : 566 - 568