Multi-objective reinforcement learning for bi-objective time-dependent pickup and delivery problem with late penalties

Cited by: 2
Authors
Santiyuda, Gemilang [1]
Wardoyo, Retantyo [1]
Pulungan, Reza [1]
Yu, Vincent F. [2]
Affiliations
[1] Univ Gadjah Mada, Fac Math & Nat Sci, Dept Comp Sci & Elect, Yogyakarta, Indonesia
[2] Natl Taiwan Univ Sci & Technol, Dept Ind Management, Taipei, Taiwan
Keywords
Pickup and delivery problem; Multi-objective; Deep reinforcement learning; Attention mechanism; Hypernetwork; VEHICLE-ROUTING PROBLEM; OPTIMIZATION; EVOLUTIONARY; ALGORITHM; REPRESENTATIONS; NETWORK; MOEA/D; BRANCH; PRICE
DOI
10.1016/j.engappai.2023.107381
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This study addresses the bi-objective time-dependent pickup and delivery problem with late penalties (TDPDPLP). Incorporating time-dependent travel times into the problem formulation to model traffic congestion is critical, especially for problems with time-related costs, because it narrows the gap between the projected and the realized quality of solutions when optimization methods are applied in the real world. This study proposes a multi-objective reinforcement learning (MORL)-based method that combines a hypernetwork and a heterogeneous attention mechanism (HAM) with a two-stage training scheme to solve the bi-objective TDPDPLP. After offline training, the proposed method can instantly generate an approximation of the Pareto optimal front (POF). An ablation study also showed that discarding coordinates from the features simplifies the model and saves several hours of training while improving the quality of the solutions. The trained model is evaluated on various instances, including real-world-based instances from Barcelona, Berlin, and Porto Alegre, using the hypervolume (HV) and additive epsilon (epsilon+) indicators of the generated POF. The proposed method is compared with another MORL method, namely preference-conditioned multi-objective combinatorial optimization (PMOCO), and with several well-known multi-objective evolutionary algorithms (MOEAs). Experiments showed that the proposed method performs better than PMOCO and the employed MOEAs on various problem instances. The trained model needs only minutes to generate a POF approximation, while the MOEAs require hours. Furthermore, it generalizes well to problem instances with different characteristics and performs well on instances from cities other than the one used in the training instances.
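The two quality indicators named in the abstract can be illustrated with a minimal sketch. It assumes both objectives are minimized and uses the standard textbook definitions of the hypervolume and the additive epsilon indicator; the function names, the reference point, and the sample fronts are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (objective 1, objective 2), both minimized (assumption)


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the non-dominated points, sorted by the first objective."""
    front: List[Point] = []
    best_f2 = float("inf")
    # After sorting by f1 (then f2), a point is non-dominated only if it
    # strictly improves on the best f2 seen so far.
    for f1, f2 in sorted(set(points)):
        if f2 < best_f2:
            front.append((f1, f2))
            best_f2 = f2
    return front


def hypervolume_2d(points: List[Point], ref: Point) -> float:
    """Area dominated by `points` and bounded by the reference point `ref`."""
    front = [p for p in pareto_front(points) if p[0] < ref[0] and p[1] < ref[1]]
    hv = 0.0
    prev_f2 = ref[1]
    # Sweep left to right, adding one horizontal strip per front point.
    for f1, f2 in front:
        hv += (ref[0] - f1) * (prev_f2 - f2)
        prev_f2 = f2
    return hv


def additive_epsilon(approx: List[Point], reference: List[Point]) -> float:
    """Smallest shift eps such that every reference point is weakly
    dominated by some point of `approx` translated by -eps."""
    return max(
        min(max(a[0] - r[0], a[1] - r[1]) for a in approx)
        for r in reference
    )


if __name__ == "__main__":
    # Hypothetical fronts, for illustration only.
    reference_front = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]
    approx_front = [(2.0, 3.0), (3.0, 2.0)]
    print(hypervolume_2d(reference_front, ref=(4.0, 4.0)))
    print(hypervolume_2d(approx_front, ref=(4.0, 4.0)))
    print(additive_epsilon(approx_front, reference_front))
```

A larger hypervolume and a smaller additive epsilon both indicate a better approximation of the front under these definitions.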
Pages: 29