Combining deep reinforcement learning with heuristics to solve the traveling salesman problem

Cited by: 0
Authors
Hong, Li [1]
Liu, Yu [1]
Xu, Mengqiao [2]
Deng, Wenhui [2]
Affiliations
[1] Dalian Univ Technol, Sch Software Technol, Dalian 116620, Peoples R China
[2] Dalian Univ Technol, Sch Econ & Management, Dalian 116024, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
traveling salesman problem; deep reinforcement learning; simulated annealing algorithm; transformer model; whale optimization algorithm; 87.55.kd; 87.55.de; 07.05.Mh; ALGORITHM;
DOI
10.1088/1674-1056/ad95f1
Chinese Library Classification number
O4 [Physics];
Subject classification code
0702;
Abstract
Recent studies employing deep learning to solve the traveling salesman problem (TSP) have mainly focused on learning construction heuristics. Such methods can improve TSP solutions but still depend on additional programs. Methods that instead learn improvement heuristics to iteratively refine solutions remain underexplored. Traditional improvement heuristics are guided by a manually designed search strategy and may achieve only limited improvements. This paper proposes a novel framework for learning improvement heuristics, which automatically discovers better improvement policies for iteratively solving the TSP. The framework first designs a new transformer-based architecture to parameterize the policy network, introducing an action-dropout layer to prevent action selection from overfitting. It then proposes a deep reinforcement learning approach integrating a simulated annealing mechanism (named RL-SA) to learn the pairwise selection policy, aiming to improve the performance of the 2-opt algorithm. RL-SA leverages the whale optimization algorithm to generate initial solutions for better sampling efficiency and uses a Gaussian perturbation strategy to tackle the sparse reward problem of reinforcement learning. The experimental results show that the proposed approach is significantly superior to state-of-the-art learning-based methods and further reduces the gap between learning-based methods and highly optimized solvers on the benchmark datasets. Moreover, the pre-trained model M can be applied to guide the SA algorithm (named M-SA (ours)), which performs better than existing deep models on small-, medium-, and large-scale TSPLIB datasets. Additionally, M-SA (ours) achieves excellent generalization performance on a real-world dataset of global liner shipping routes, with distance reductions ranging from 3.52% to 17.99%.
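For readers unfamiliar with the building blocks named in the abstract, the following is a minimal sketch of a 2-opt improvement loop with a simulated annealing acceptance rule. It is not the authors' RL-SA method: the node pair is drawn uniformly at random where the paper uses a learned policy, the initial tour is a random shuffle rather than a whale-optimization solution, and the function names and parameters (`sa_two_opt`, `steps`, `t0`, `alpha`) are assumptions made for illustration.

```python
import math
import random


def tour_length(tour, pts):
    """Total Euclidean length of the closed tour over points `pts`."""
    n = len(tour)
    return sum(math.dist(pts[tour[k]], pts[tour[(k + 1) % n]]) for k in range(n))


def two_opt_move(tour, i, j):
    """Return a new tour with the segment tour[i..j] reversed (a 2-opt move)."""
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]


def sa_two_opt(pts, steps=20000, t0=1.0, alpha=0.9995, seed=0):
    """Improve a random tour with 2-opt moves accepted by a Metropolis rule.

    Assumption: the pair (i, j) is sampled uniformly; a learned policy
    would choose it instead.
    """
    rng = random.Random(seed)
    n = len(pts)
    tour = list(range(n))
    rng.shuffle(tour)
    cur_len = tour_length(tour, pts)
    best, best_len = tour[:], cur_len
    t = t0
    for _ in range(steps):
        i, j = sorted(rng.sample(range(n), 2))
        cand = two_opt_move(tour, i, j)
        cand_len = tour_length(cand, pts)
        delta = cand_len - cur_len
        # Always accept improvements; accept worse tours with prob exp(-delta / t).
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            tour, cur_len = cand, cand_len
            if cur_len < best_len:
                best, best_len = tour[:], cur_len
        t = max(t * alpha, 1e-12)  # geometric cooling with a floor
    return best, best_len


if __name__ == "__main__":
    rng = random.Random(42)
    points = [(rng.random(), rng.random()) for _ in range(30)]
    best_tour, best_len = sa_two_opt(points)
    print(f"best tour length: {best_len:.4f}")
```

Recomputing the full tour length for every candidate keeps the sketch short; a practical implementation would evaluate only the two edges a 2-opt move replaces.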
Pages: 11
Related papers
50 in total
  • [1] Combining deep reinforcement learning with heuristics to solve the traveling salesman problem
    Hong, Li
    Liu, Yu
    Xu, Mengqiao
    Deng, Wenhui
    Chinese Physics B, 2025, 34 (01) : 100 - 110
  • [2] Deep reinforcement learning combined with transformer to solve the traveling salesman problem
    Liu, Chang
    Feng, Xue-Feng
    Li, Feng
    Xian, Qing-Long
    Jia, Zhen-Hong
    Wang, Yu-Hang
    Du, Zong-Dong
    JOURNAL OF SUPERCOMPUTING, 2025, 81 (01):
  • [3] Combining reinforcement learning algorithm and genetic algorithm to solve the traveling salesman problem
    Ruan, Yaqi
    Cai, Weihong
    Wang, Jiaying
    JOURNAL OF ENGINEERING-JOE, 2024, 2024 (06):
  • [4] Applying Deep Learning and Reinforcement Learning to Traveling Salesman Problem
    Miki, Shoma
    Yamamoto, Daisuke
    Ebara, Hiroyuki
    2018 INTERNATIONAL CONFERENCE ON COMPUTING, ELECTRONICS & COMMUNICATIONS ENGINEERING (ICCECE), 2018, : 65 - 70
  • [5] A new multiagent reinforcement learning algorithm to solve the symmetric traveling salesman problem
    Alipour, Mir Mohammad
    Razavi, Seyed Naser
    MULTIAGENT AND GRID SYSTEMS, 2015, 11 (02) : 107 - 119
  • [6] A deep reinforcement learning approach for solving the Traveling Salesman Problem with Drone
    Bogyrbayeva, Aigerim
    Yoon, Taehyun
    Ko, Hanbum
    Lim, Sungbin
    Yun, Hyokun
    Kwon, Changhyun
    TRANSPORTATION RESEARCH PART C-EMERGING TECHNOLOGIES, 2023, 148
  • [7] Deep Reinforcement Learning for Traveling Salesman Problem with Time Windows and Rejections
    Zhang, Rongkai
    Prokhorchuk, Anatolii
    Dauwels, Justin
    2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020,
  • [8] Reinforcement learning for the traveling salesman problem with refueling
    Ottoni, André L. C.
    Nepomuceno, Erivelton G.
    Oliveira, Marcos S. de
    Oliveira, Daniela C. R. de
    COMPLEX & INTELLIGENT SYSTEMS, 2022, 8 (03) : 2001 - 2015
  • [9] Combining Reinforcement Learning with Lin-Kernighan-Helsgaun Algorithm for the Traveling Salesman Problem
    Zheng, Jiongzhi
    He, Kun
    Zhou, Jianrong
    Jin, Yan
    Li, Chu-Min
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 12445 - 12452