Online model-based reinforcement learning for decision-making in long distance routes

Cited by: 2
Authors
Alcaraz, Juan J. [1 ]
Losilla, Fernando [1 ]
Caballero-Arnaldos, Luis [1 ]
Affiliation
[1] Tech Univ Cartagena UPCT, Dept Informat & Commun Technol, Cartagena, Spain
Keywords
Route scheduling; Reinforcement learning; Model predictive control; Monte Carlo tree search; VEHICLE-ROUTING PROBLEM; TIME WINDOWS; STOCHASTIC TRAVEL; OPTIMIZATION; FRAMEWORK; SERVICE;
DOI
10.1016/j.tre.2022.102790
Chinese Library Classification (CLC) number
F [Economics];
Discipline classification code
02 ;
Abstract
In road transportation, long-distance routes require scheduled driving times, breaks, and rest periods, in compliance with the regulations on working conditions for truck drivers, while ensuring goods are delivered within the time windows of each customer. However, routes are subject to uncertain travel and service times, and incidents may cause additional delays, making predefined schedules ineffective in many real-life situations. This paper presents a reinforcement learning (RL) algorithm capable of making en-route decisions regarding driving times, breaks, and rest periods, under uncertain conditions. Our proposal aims at maximizing the likelihood of on-time delivery while complying with drivers' work regulations. We use an online model-based RL strategy that needs no prior training and is more flexible than model-free RL approaches, where the agent must be trained offline before making online decisions. Our proposal combines model predictive control with a rollout strategy and Monte Carlo tree search. At each decision stage, our algorithm anticipates the consequences of all the possible decisions in a number of future stages (the lookahead horizon), and then uses a base policy to generate a sequence of decisions beyond the lookahead horizon. This base policy could be, for example, a set of decision rules based on the experience and expertise of the transportation company covering the routes. Our numerical results show that the policy obtained using our algorithm outperforms not only the base policy (up to 83%), but also a policy obtained offline using deep Q networks (DQN), a state-of-the-art, model-free RL algorithm.
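The lookahead-plus-rollout idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy route model, the regulation limit `MAX_STREAK`, the 0.3 delay probability, the base policy, and every function name are assumptions made for illustration, and the lookahead is a plain exhaustive expansion with sampled transitions rather than Monte Carlo tree search.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    remaining: int   # distance units left to the customer
    time_left: int   # stages left before the time window closes
    streak: int      # consecutive driving stages since the last break


MAX_STREAK = 2       # hypothetical rule: at most 2 driving stages in a row
ACTIONS = ("drive", "break")


def legal_actions(s):
    # A break is always allowed; driving only while under the streak limit.
    return [a for a in ACTIONS if a == "break" or s.streak < MAX_STREAK]


def step(s, action, rng):
    # Toy stochastic model: a driving stage covers 1 unit, or 0 if delayed.
    if action == "drive":
        progress = 0 if rng.random() < 0.3 else 1
        return State(max(s.remaining - progress, 0), s.time_left - 1, s.streak + 1)
    return State(s.remaining, s.time_left - 1, 0)


def terminal(s):
    return s.remaining == 0 or s.time_left == 0


def outcome(s):
    # Reward 1 for on-time delivery, 0 otherwise.
    return 1.0 if s.remaining == 0 else 0.0


def base_policy(s):
    # Cautious rule of thumb: take a break after every driving stage.
    return "break" if s.streak >= 1 else "drive"


def rollout(s, rng):
    # Follow the base policy until the route ends, then score the result.
    while not terminal(s):
        s = step(s, base_policy(s), rng)
    return outcome(s)


def state_value(s, depth, rng, n_samples, n_rollouts):
    if terminal(s):
        return outcome(s)
    if depth == 0:
        # Beyond the lookahead horizon: estimate with base-policy rollouts.
        return sum(rollout(s, rng) for _ in range(n_rollouts)) / n_rollouts
    return max(q_value(s, a, depth, rng, n_samples, n_rollouts)
               for a in legal_actions(s))


def q_value(s, a, depth, rng, n_samples, n_rollouts):
    # Average over sampled transitions, since travel progress is uncertain.
    total = 0.0
    for _ in range(n_samples):
        nxt = step(s, a, rng)
        total += state_value(nxt, depth - 1, rng, n_samples, n_rollouts)
    return total / n_samples


def decide(s, rng, depth=2, n_samples=10, n_rollouts=10):
    # Pick the legal action with the best estimated on-time probability.
    return max(legal_actions(s),
               key=lambda a: q_value(s, a, depth, rng, n_samples, n_rollouts))


if __name__ == "__main__":
    rng = random.Random(0)
    print(decide(State(remaining=3, time_left=5, streak=0), rng))
```

In this sketch the decision is recomputed at every stage from the current state, so no offline training is needed; the quality of the result depends on the assumed model and on how many samples and rollouts are used.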
Pages: 21
Related papers
50 in total
  • [41] Online Operational Decision-Making for Integrated Electric-Gas Systems With Safe Reinforcement Learning
    Sayed, Ahmed Rabee
    Zhang, Xian
    Wang, Guibin
    Qiu, Jing
    Wang, Cheng
    IEEE TRANSACTIONS ON POWER SYSTEMS, 2024, 39 (02) : 2893 - 2906
  • [42] Episodic memory governs choices: An RNN-based reinforcement learning model for decision-making task
    Zhang, Xiaohan
    Liu, Lu
    Long, Guodong
    Jiang, Jing
    Liu, Shenquan
    NEURAL NETWORKS, 2021, 134 : 1 - 10
  • [43] Intelligent Driving Decision-Making Strategy for New Energy Vehicles Based on Lightweight Reinforcement Learning Model
    Li, Wen-Tao
    Zhang, Zhi
    Yi, Xiang-Yu
    Dong, Xiao-Bo
    Zhang, Liang-Gui
    Yang, Li-Yun
    Journal of Computers (Taiwan), 2024, 35 (05) : 237 - 252
  • [44] A Guide to an Iterative Approach to Model-Based Decision Making in Health and Medicine: An Iterative Decision-Making Framework
    Kunst, Natalia
    Burger, Emily A.
    Coupe, Veerle M. H.
    Kuntz, Karen M.
    Aas, Eline
    PHARMACOECONOMICS, 2024, 42 (04) : 363 - 371
  • [45] Intelligent vehicle driving decision-making model based on variational AutoEncoder network and deep reinforcement learning
    Wang, Shufeng
    Wang, Zhengli
    Wang, Xinkai
    Liang, Qingwei
    Meng, Lingyi
    Expert Systems with Applications, 2025, 268
  • [46] Model-based Bayesian Reinforcement Learning in Factored Markov Decision Process
    Wu, Bo
    Feng, Yanpeng
    Zheng, Hongyan
    JOURNAL OF COMPUTERS, 2014, 9 (04) : 845 - 850
  • [47] A Guide to an Iterative Approach to Model-Based Decision Making in Health and Medicine: An Iterative Decision-Making Framework
    Natalia Kunst
    Emily A. Burger
    Veerle M. H. Coupé
    Karen M. Kuntz
    Eline Aas
    PharmacoEconomics, 2024, 42 : 363 - 371
  • [48] MODEL-BASED METHOD FOR COMPUTER-AIDED MEDICAL DECISION-MAKING
    WEISS, SM
    KULIKOWSKI, CA
    AMAREL, S
    SAFIR, A
    ARTIFICIAL INTELLIGENCE, 1978, 11 (1-2) : 145 - 172
  • [49] From model-based perceptual decision-making to spatial interference control
    van Maanen, Leendert
    Turner, Brandon
    Forstmann, Birte U.
    CURRENT OPINION IN BEHAVIORAL SCIENCES, 2015, 1 : 72 - 77
  • [50] Optimization of Data Collection Strategies for Model-Based Evaluation and Decision-Making
    Cain, Robert
    van Moorsel, Aad
    2012 42ND ANNUAL IEEE/IFIP INTERNATIONAL CONFERENCE ON DEPENDABLE SYSTEMS AND NETWORKS (DSN), 2012,