Online model-based reinforcement learning for decision-making in long distance routes

Cited by: 2
Authors
Alcaraz, Juan J. [1]
Losilla, Fernando [1]
Caballero-Arnaldos, Luis [1]
Affiliations
[1] Tech Univ Cartagena UPCT, Dept Informat & Commun Technol, Cartagena, Spain
Keywords
Route scheduling; Reinforcement learning; Model predictive control; Monte Carlo tree search; VEHICLE-ROUTING PROBLEM; TIME WINDOWS; STOCHASTIC TRAVEL; OPTIMIZATION; FRAMEWORK; SERVICE;
DOI
10.1016/j.tre.2022.102790
Chinese Library Classification (CLC) number
F [Economics]
Discipline classification code
02
Abstract
In road transportation, long-distance routes require scheduled driving times, breaks, and rest periods, in compliance with the regulations on working conditions for truck drivers, while ensuring goods are delivered within the time windows of each customer. However, routes are subject to uncertain travel and service times, and incidents may cause additional delays, making predefined schedules ineffective in many real-life situations. This paper presents a reinforcement learning (RL) algorithm capable of making en-route decisions regarding driving times, breaks, and rest periods, under uncertain conditions. Our proposal aims at maximizing the likelihood of on-time delivery while complying with drivers' work regulations. We use an online model-based RL strategy that needs no prior training and is more flexible than model-free RL approaches, where the agent must be trained offline before making online decisions. Our proposal combines model predictive control with a rollout strategy and Monte Carlo tree search. At each decision stage, our algorithm anticipates the consequences of all the possible decisions in a number of future stages (the lookahead horizon), and then uses a base policy to generate a sequence of decisions beyond the lookahead horizon. This base policy could be, for example, a set of decision rules based on the experience and expertise of the transportation company covering the routes. Our numerical results show that the policy obtained using our algorithm outperforms not only the base policy (up to 83%), but also a policy obtained offline using deep Q networks (DQN), a state-of-the-art, model-free RL algorithm.
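The lookahead-plus-rollout idea described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy route model (`DISTANCE`, `DEADLINE`, `MAX_DRIVE`, a 0.2 incident probability), the function names, and the rule-based `base_policy`. For brevity the sketch enumerates every open-loop action sequence over the lookahead horizon instead of running Monte Carlo tree search, so it should be read as a simplified rollout scheme, not the authors' algorithm.

```python
import random
from itertools import product

# Hypothetical toy model (not from the paper): the driver must cover DISTANCE
# units within DEADLINE stages and may not drive more than MAX_DRIVE
# consecutive stages without a break.
DISTANCE, DEADLINE, MAX_DRIVE = 10, 16, 4
ACTIONS = ("drive", "break")

def step(state, action, rng):
    """Simulate one stage; state = (remaining, elapsed, consecutive_drive)."""
    remaining, t, run = state
    if action == "drive":
        # Uncertain travel: an incident stalls progress with probability 0.2.
        progress = 0 if rng.random() < 0.2 else 1
        return (max(remaining - progress, 0), t + 1, run + 1)
    return (remaining, t + 1, 0)  # a break resets the driving counter

def is_terminal(state):
    remaining, t, run = state
    return remaining == 0 or t >= DEADLINE or run > MAX_DRIVE

def reward(state):
    """1.0 for an on-time, rule-compliant delivery, otherwise 0.0."""
    remaining, _, run = state
    return 1.0 if remaining == 0 and run <= MAX_DRIVE else 0.0

def base_policy(state):
    """Simple decision rule: drive until a break becomes mandatory."""
    return "break" if state[2] >= MAX_DRIVE else "drive"

def rollout(state, first_actions, rng):
    """Apply the lookahead actions, then follow the base policy to the end."""
    for action in first_actions:
        if is_terminal(state):
            break
        state = step(state, action, rng)
    while not is_terminal(state):
        state = step(state, base_policy(state), rng)
    return reward(state)

def choose_action(state, horizon=2, n_sims=200, seed=0):
    """Score every action sequence of length `horizon` by simulation and
    return the first action of the best one with its estimated value."""
    rng = random.Random(seed)
    best_action, best_value = None, -1.0
    for seq in product(ACTIONS, repeat=horizon):
        value = sum(rollout(state, seq, rng) for _ in range(n_sims)) / n_sims
        if value > best_value:
            best_action, best_value = seq[0], value
    return best_action, best_value
```

In a model predictive control loop, only the returned first action would be executed; the procedure would then be repeated from the newly observed state, which is why no offline training is needed in this kind of scheme.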
Pages: 21
Related papers
50 in total
  • [31] Review of Autonomous Driving Decision-Making Research Based on Reinforcement Learning
    Jin L.
    Han G.
    Xie X.
    Guo B.
    Liu G.
    Zhu W.
    Qiche Gongcheng/Automotive Engineering, 2023, 45 (04): : 527 - 540
  • [32] A Decision-Making Method of Intelligent Distance Online Education Based on Cloud Computing
    Jun-yan Tong
    Gautam Srivastava
    Mobile Networks and Applications, 2022, 27 : 1151 - 1161
  • [33] A Decision-Making Method of Intelligent Distance Online Education Based on Cloud Computing
    Tong, Jun-yan
    Srivastava, Gautam
    MOBILE NETWORKS & APPLICATIONS, 2022, 27 (03): : 1151 - 1161
  • [34] Offline Model-Based Adaptable Policy Learning for Decision-Making in Out-of-Support Regions
    Chen, Xiong-Hui
    Luo, Fan-Ming
    Yu, Yang
    Li, Qingyang
    Qin, Zhiwei
    Shang, Wenjie
    Ye, Jieping
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (12) : 15260 - 15274
  • [35] Quantum reinforcement learning during human decision-making
    Ji-An Li
    Daoyi Dong
    Zhengde Wei
    Ying Liu
    Yu Pan
    Franco Nori
    Xiaochu Zhang
    Nature Human Behaviour, 2020, 4 : 294 - 307
  • [36] Quantum reinforcement learning during human decision-making
    Li, Ji-An
    Dong, Daoyi
    Wei, Zhengde
    Liu, Ying
    Pan, Yu
    Nori, Franco
    Zhang, Xiaochu
    NATURE HUMAN BEHAVIOUR, 2020, 4 (03) : 294 - 307
  • [37] Reinforcement learning for decision-making under deep uncertainty
    Pei, Zhihao
    Rojas-Arevalo, Angela M.
    de Haan, Fjalar J.
    Lipovetzky, Nir
    Moallemi, Enayat A.
    JOURNAL OF ENVIRONMENTAL MANAGEMENT, 2024, 359
  • [38] Application of Reinforcement Learning in Multiagent Intelligent Decision-Making
    Han, Xiaoyu
    COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE, 2022, 2022
  • [39] Model-Based Reinforcement Learning Framework of Online Network Resource Allocation
    Bakhshi, Bahador
    Mangues-Bafalluy, Josep
    IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS (ICC 2022), 2022, : 4456 - 4461
  • [40] Efficient model-based reinforcement learning for approximate online optimal control
    Kamalapurkar, Rushikesh
    Rosenfeld, Joel A.
    Dixon, Warren E.
    AUTOMATICA, 2016, 74 : 247 - 258