Multi-objective reinforcement learning for bi-objective time-dependent pickup and delivery problem with late penalties

Cited by: 2
Authors
Santiyuda, Gemilang [1]
Wardoyo, Retantyo [1]
Pulungan, Reza [1]
Yu, Vincent F. [2]
Affiliations
[1] Univ Gadjah Mada, Fac Math & Nat Sci, Dept Comp Sci & Elect, Yogyakarta, Indonesia
[2] Natl Taiwan Univ Sci & Technol, Dept Ind Management, Taipei, Taiwan
Keywords
Pickup and delivery problem; Multi-objective; Deep reinforcement learning; Attention mechanism; Hypernetwork; VEHICLE-ROUTING PROBLEM; OPTIMIZATION; EVOLUTIONARY; ALGORITHM; REPRESENTATIONS; NETWORK; MOEA/D; BRANCH; PRICE
DOI
10.1016/j.engappai.2023.107381
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
This study addresses the bi-objective time-dependent pickup and delivery problem with late penalties (TDPDPLP). Modeling traffic congestion through time-dependent travel times is critical, especially for problems with time-related costs, because it narrows the gap between the projected and the realized quality of solutions when optimization methods are applied in the real world. This study proposes a multi-objective reinforcement learning (MORL)-based method that combines a hypernetwork and a heterogeneous attention mechanism (HAM) with a two-stage training scheme to solve the bi-objective TDPDPLP. After offline training, the proposed method can instantly generate an approximation of the Pareto optimal front (POF). An ablation study also showed that discarding coordinates from the features simplifies the model and saves several hours of training while improving the quality of the solutions. The trained model is evaluated on various instances, including real-world-based instances from Barcelona, Berlin, and Porto Alegre, using the hypervolume (HV) and additive epsilon (epsilon+) indicators of the generated POF. The proposed method is compared with another MORL method, namely preference-conditioned multi-objective combinatorial optimization (PMOCO), and with several well-known multi-objective evolutionary algorithms (MOEAs). Experiments showed that the proposed method performs better than PMOCO and the employed MOEAs on various problem instances. The trained model needs only minutes to generate a POF approximation, while the MOEAs require hours. Furthermore, it generalizes well to problem instances with different characteristics and performs well on instances from cities other than the city used in the training instances.
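The two quality indicators named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes both objectives are minimized, and the function names, the reference point, and the example fronts are hypothetical. The abstract does not state how the paper chooses the reference point or normalizes the objectives.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (objective 1, objective 2), both minimized (assumption)


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the non-dominated points, sorted by the first objective."""
    front: List[Point] = []
    best_f2 = float("inf")
    for f1, f2 in sorted(set(points)):
        # After sorting by f1, a point is non-dominated only if it
        # strictly improves the best second objective seen so far.
        if f2 < best_f2:
            front.append((f1, f2))
            best_f2 = f2
    return front


def hypervolume_2d(points: List[Point], ref: Point) -> float:
    """Area dominated by the front and bounded by the reference point."""
    inside = [p for p in points if p[0] < ref[0] and p[1] < ref[1]]
    hv = 0.0
    prev_f2 = ref[1]
    for f1, f2 in pareto_front(inside):  # f1 ascending, f2 descending
        hv += (ref[0] - f1) * (prev_f2 - f2)
        prev_f2 = f2
    return hv


def additive_epsilon(approx: List[Point], reference: List[Point]) -> float:
    """Smallest shift eps such that every reference point is weakly
    dominated by some approximation point moved by eps in each objective."""
    return max(
        min(max(a[0] - r[0], a[1] - r[1]) for a in approx)
        for r in reference
    )


# Hypothetical fronts, for illustration only.
reference_front = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]
approx_front = [(2.0, 3.0), (3.0, 2.0)]
hv_value = hypervolume_2d(reference_front, ref=(4.0, 4.0))
eps_value = additive_epsilon(approx_front, reference_front)
```

A larger HV and a smaller epsilon+ both indicate a better approximation of the POF; epsilon+ equals 0 when the approximation covers every point of the reference front.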
Pages: 29