A Graph Pointer Network-Based Multi-Objective Deep Reinforcement Learning Algorithm for Solving the Traveling Salesman Problem

Cited by: 12
Authors
Perera, Jeewaka [1 ,2 ]
Liu, Shih-Hsi [1 ]
Mernik, Marjan [3 ]
Crepinsek, Matej [3 ]
Ravber, Miha [3 ]
Affiliations
[1] Calif State Univ Fresno, Dept Comp Sci, Fresno, CA 93740 USA
[2] Sri Lanka Inst Informat Technol, Fac Comp, Malabe 10115, Sri Lanka
[3] Univ Maribor, Fac Elect Engn & Comp Sci, Koroska Cesta 46, 2000 Maribor, Slovenia
Keywords
multi-objective optimization; traveling salesman problems; deep reinforcement learning; evolutionary algorithms; indicators
DOI
10.3390/math11020437
Chinese Library Classification
O1 [Mathematics]
Subject Classification Codes
0701; 070101
Abstract
Traveling Salesman Problems (TSPs) have been a long-standing challenge for researchers across many fields, and their difficulty grows further when multiple objectives must be optimized concurrently. A substantial body of evolutionary-algorithm work has addressed multi-objective TSPs with promising results, and work in deep learning and reinforcement learning has been surging. This paper introduces a multi-objective deep graph pointer network-based reinforcement learning (MODGRL) algorithm for multi-objective TSPs. MODGRL improves an earlier multi-objective deep reinforcement learning algorithm, DRL-MOA, by utilizing a graph pointer network to learn the graphical structure of TSPs. This improvement allows MODGRL to be trained on a small-scale TSP yet find optimal solutions for large-scale TSPs. NSGA-II, MOEA/D, and SPEA2 were selected for comparison with MODGRL and DRL-MOA, and the hypervolume, spread, and coverage over Pareto front (CPF) quality indicators were used to assess the algorithms' performance. In terms of the hypervolume indicator, which captures both the convergence and diversity of Pareto frontiers, MODGRL outperformed all competitors on the three well-known benchmark problems. These findings show that MODGRL, with the improved graph pointer network, indeed performed better than DRL-MOA and the three evolutionary algorithms as measured by the hypervolume indicator. MODGRL and DRL-MOA were comparable in the leading group as measured by the spread indicator. Although MODGRL performed better than DRL-MOA, both were only average in the evenness and diversity measured by the CPF indicator. These findings remind us that different performance indicators evaluate Pareto frontiers from different perspectives; choosing a well-accepted performance indicator suited to one's experimental design is critical and may affect the conclusions.
The three evolutionary algorithms were also run with extra iterations, to validate whether extra iterations affected their performance. The results show that NSGA-II and SPEA2 improved greatly as measured by the spread and CPF indicators. These findings raise fairness concerns about algorithm comparisons that apply different fixed stopping criteria to different algorithms, as appeared in the DRL-MOA work and many others. From these lessons, we conclude that MODGRL indeed performed better than DRL-MOA in terms of hypervolume, and we also urge researchers toward fair experimental designs and comparisons, in order to derive scientifically sound conclusions.
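The hypervolume indicator discussed above measures the volume of objective space dominated by a Pareto front relative to a reference point; a larger value indicates better convergence and diversity. As an illustration only, and not the paper's actual implementation, a minimal sketch for the two-objective minimization case (the bi-objective TSP setting) is:

```python
def hypervolume_2d(front, ref):
    """Hypervolume of a 2-objective minimization front w.r.t. a reference point.

    front: list of (f1, f2) mutually non-dominated points, each dominating ref.
    ref:   reference point (r1, r2), worse than every front point in both objectives.
    """
    # Sorting by the first objective ascending makes the second objective
    # strictly decreasing along a non-dominated minimization front.
    pts = sorted(front)
    hv = 0.0
    prev_f2 = ref[1]
    for f1, f2 in pts:
        # Each point contributes a rectangle: width up to the reference point
        # in objective 1, height between consecutive f2 values.
        hv += (ref[0] - f1) * (prev_f2 - f2)
        prev_f2 = f2
    return hv
```

For fronts with more objectives, or for routine experiments, established implementations (e.g., in multi-objective optimization libraries) are preferable, since the choice of reference point strongly affects the resulting values.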
Pages: 21