Online model-based reinforcement learning for decision-making in long distance routes

Cited by: 2
Authors
Alcaraz, Juan J. [1 ]
Losilla, Fernando [1 ]
Caballero-Arnaldos, Luis [1 ]
Affiliations
[1] Tech Univ Cartagena UPCT, Dept Informat & Commun Technol, Cartagena, Spain
Keywords
Route scheduling; Reinforcement learning; Model predictive control; Monte Carlo tree search; VEHICLE-ROUTING PROBLEM; TIME WINDOWS; STOCHASTIC TRAVEL; OPTIMIZATION; FRAMEWORK; SERVICE;
DOI
10.1016/j.tre.2022.102790
Chinese Library Classification
F [Economics]
Discipline classification code
02
Abstract
In road transportation, long-distance routes require scheduled driving times, breaks, and rest periods, in compliance with the regulations on working conditions for truck drivers, while ensuring goods are delivered within the time windows of each customer. However, routes are subject to uncertain travel and service times, and incidents may cause additional delays, making predefined schedules ineffective in many real-life situations. This paper presents a reinforcement learning (RL) algorithm capable of making en-route decisions regarding driving times, breaks, and rest periods, under uncertain conditions. Our proposal aims at maximizing the likelihood of on-time delivery while complying with drivers' work regulations. We use an online model-based RL strategy that needs no prior training and is more flexible than model-free RL approaches, where the agent must be trained offline before making online decisions. Our proposal combines model predictive control with a rollout strategy and Monte Carlo tree search. At each decision stage, our algorithm anticipates the consequences of all the possible decisions in a number of future stages (the lookahead horizon), and then uses a base policy to generate a sequence of decisions beyond the lookahead horizon. This base policy could be, for example, a set of decision rules based on the experience and expertise of the transportation company covering the routes. Our numerical results show that the policy obtained using our algorithm outperforms not only the base policy (up to 83%), but also a policy obtained offline using deep Q networks (DQN), a state-of-the-art, model-free RL algorithm.
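The lookahead-plus-base-policy idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy route model, the constants `MAX_STREAK` and `P_DELAY`, the two-action set, and the `base_policy` rule are all hypothetical, and exhaustive enumeration of fixed action plans stands in for Monte Carlo tree search. Each candidate plan covers the lookahead horizon, the base policy completes the route, and the on-time rate is estimated by simulation.

```python
import random
from itertools import product

# Toy model: every constant and rule below is an illustrative assumption.
MAX_STREAK = 4    # consecutive driving steps allowed before a mandatory break
P_DELAY = 0.3     # probability that a driving step makes no progress
ACTIONS = ("drive", "break")

# A state is (remaining_distance, time_left, driving_streak).

def legal_actions(state):
    """Only a break is allowed once the driving limit is reached."""
    return ("break",) if state[2] >= MAX_STREAK else ACTIONS

def step(state, action, rng):
    """Sample the next state from the stochastic model."""
    remaining, time_left, streak = state
    if action == "drive":
        progress = 0 if rng.random() < P_DELAY else 1
        return (remaining - progress, time_left - 1, streak + 1)
    return (remaining, time_left - 1, 0)

def is_terminal(state):
    """The episode ends on arrival or when the time window closes."""
    return state[0] <= 0 or state[1] <= 0

def base_policy(state):
    """Hypothetical company rule: take a break after every two driving steps."""
    return "break" if state[2] >= 2 else "drive"

def simulate(state, plan, rng):
    """Follow `plan`, then the base policy; return 1.0 if delivery is on time."""
    for action in plan:
        if is_terminal(state):
            break
        if action not in legal_actions(state):
            return 0.0  # plans that violate the driving limit are rejected
        state = step(state, action, rng)
    while not is_terminal(state):
        state = step(state, base_policy(state), rng)
    return 1.0 if state[0] <= 0 else 0.0

def lookahead_decision(state, horizon, n_sims, rng):
    """Pick the first action of the plan with the best estimated on-time rate."""
    best_action, best_value = None, -1.0
    for first in legal_actions(state):
        for tail in product(ACTIONS, repeat=horizon - 1):
            plan = (first,) + tail
            value = sum(simulate(state, plan, rng) for _ in range(n_sims)) / n_sims
            if value > best_value:
                best_action, best_value = first, value
    return best_action
```

For example, `lookahead_decision((6, 10, 0), horizon=3, n_sims=200, rng=random.Random(0))` returns the action to take now; the decision is recomputed at every stage from the newly observed state, which is what makes the scheme online and free of prior training.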
Pages: 21