Fitted Q-Iteration via Max-Plus-Linear Approximation

Cited by: 0
Authors
Liu, Yichen [1]
Kolarijani, Mohamad Amin Sharifi [1]
Affiliation
[1] Delft Univ Technol, Delft Ctr Syst & Control, NL-2628 CD Delft, Netherlands
Source
Keywords
Approximation algorithms; Convergence; Vectors; Standards; Optimal control; Complexity theory; Algebra; Real-time systems; Neural networks; Medical services; Reinforcement learning; stochastic optimal control; computational methods;
DOI
10.1109/LCSYS.2024.3520060
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
In this letter, we consider the application of max-plus-linear approximators for the Q-function in offline reinforcement learning of discounted Markov decision processes. In particular, we use these approximators to propose novel fitted Q-iteration (FQI) algorithms with provable convergence. Exploiting the compatibility of the Bellman operator with max-plus operations, we show that the max-plus-linear regression within each iteration of the proposed FQI algorithm reduces to simple max-plus matrix-vector multiplications. We also consider the variational implementation of the proposed algorithm, which leads to a per-iteration complexity that is independent of the number of samples.
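As a rough illustration of the "max-plus matrix-vector multiplication" mentioned in the abstract, the following sketch shows the max-plus product and the standard residuation formula for the greatest solution of A ⊗ x ≤ b. It is not the algorithm from the letter: the function names, the small matrix `A`, and the vectors are hypothetical, chosen only to show that this kind of regression needs nothing beyond additions, maxima, and minima.

```python
# Illustrative sketch only; names and data are assumptions, not from the letter.

def maxplus_matvec(A, x):
    """Max-plus product: (A ⊗ x)_i = max_j (A[i][j] + x[j])."""
    return [max(a + xj for a, xj in zip(row, x)) for row in A]

def greatest_subsolution(A, b):
    """Largest x with A ⊗ x <= b (residuation): x_j = min_i (b[i] - A[i][j])."""
    n_rows, n_cols = len(A), len(A[0])
    return [min(b[i] - A[i][j] for i in range(n_rows)) for j in range(n_cols)]

# Hypothetical feature matrix and weight vector.
A = [[0.0, 2.0],
     [1.0, 0.0]]
x = [1.0, 3.0]

b = maxplus_matvec(A, x)            # targets generated by the true weights
x_hat = greatest_subsolution(A, b)  # recovered weights (componentwise >= x)
b_hat = maxplus_matvec(A, x_hat)    # fitted values reproduce b exactly here
```

In this toy case `x_hat` differs from `x` in its first component yet yields the same fitted values, which reflects that max-plus-linear fits are generally not unique; the residuation formula picks the largest one.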
Pages: 3201-3206
Number of pages: 6
Related papers
50 in total
  • [31] Balancing comfort and energy consumption of a heat pump using batch reinforcement learning with fitted Q-iteration
    Vazquez-Canteli, Jose
    Kampf, Jerome
    Nagy, Zoltan
    CISBAT 2017 INTERNATIONAL CONFERENCE FUTURE BUILDINGS & DISTRICTS - ENERGY EFFICIENCY FROM NANO TO URBAN SCALE, 2017, 122 : 415 - 420
  • [32] Model predictive control for max-plus-linear systems: Linear programming solution
    Zou, Yuanyuan
    Li, Shaoyuan
    WCICA 2006: SIXTH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION, VOLS 1-12, CONFERENCE PROCEEDINGS, 2006, : 339 - 343
  • [33] Exact and approximate approaches to the identification of stochastic max-plus-linear systems
    Samira S. Farahani
    Ton van den Boom
    Bart De Schutter
    Discrete Event Dynamic Systems, 2014, 24 : 447 - 471
  • [34] Stable Model Predictive Control for Constrained Max-Plus-Linear Systems
    Ion Necoara
    Bart De Schutter
    Ton J. J. van den Boom
    Hans Hellendoorn
    Discrete Event Dynamic Systems, 2007, 17 : 329 - 354
  • [35] Finite-horizon min-max control of max-plus-linear systems
    Necoara, Ion
    Kerrigan, Eric C.
    De Schutter, Bart
    van den Boom, Ton J. J.
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2007, 52 (06) : 1088 - 1093
  • [36] Model predictive control for max-plus-linear discrete event systems
    De Schutter, B
    van den Boom, T
    AUTOMATICA, 2001, 37 (07) : 1049 - 1056
  • [37] Stable model predictive control for constrained max-plus-linear systems
    Necoara, Ion
    De Schutter, Bart
    van den Boom, Ton J. J.
    Hellendoorn, Hans
    DISCRETE EVENT DYNAMIC SYSTEMS-THEORY AND APPLICATIONS, 2007, 17 (03): : 329 - 354
  • [38] A model predictive control for max-plus-linear systems with interval parameters
    Masuda, Shiro
    2006 SICE-ICASE International Joint Conference, Vols 1-13, 2006, : 5458 - 5461
  • [39] Exact and approximate approaches to the identification of stochastic max-plus-linear systems
    Farahani, Samira S.
    van den Boom, Ton
    De Schutter, Bart
    DISCRETE EVENT DYNAMIC SYSTEMS-THEORY AND APPLICATIONS, 2014, 24 (04): : 447 - 471
  • [40] A Railway Timetable Scheduling Model based on a Max-Plus-Linear System
    Sagawa, Kyohei
    Yoshimura, Nozomi
    Shimakawa, Yoichi
    Goto, Hiroyuki
    2020 59TH ANNUAL CONFERENCE OF THE SOCIETY OF INSTRUMENT AND CONTROL ENGINEERS OF JAPAN (SICE), 2020, : 1575 - 1580