Learning to Escape: Multi-mode Policy Learning for the Traveling Salesmen Problem

Cited by: 0

Authors:

Ha, Myoung Hoon [1]

Chi, Seunggeun [2]

Lee, Sang Wan [3]

Affiliations:

[1] Korea Adv Inst Sci & Technol, Ctr Neurosci Inspired AI, Daejeon, South Korea

[2] Purdue Univ, Sch Elect & Comp Engn, W Lafayette, IN 47907 USA

[3] Korea Adv Inst Sci & Technol, Dept Brain Cognit Sci, Daejeon, South Korea

Source:

IEEE CONFERENCE ON EVOLVING AND ADAPTIVE INTELLIGENT SYSTEMS 2024, IEEE EAIS 2024 | 2024

Keywords:

Traveling Salesmen Problem; Neural Combinatoric Optimization; Deep Reinforcement Learning; Transformer;

DOI:

10.1109/EAIS58494.2024.10569999

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The traveling salesmen problem (TSP), one of the most fundamental NP-hard problems in combinatorial optimization, has received considerable attention owing to its direct applicability to real-world routing. Recent studies on TSP have adopted a deep policy network to learn a stochastic acceptance rule. Despite its success in some cases, the structural and functional complexity of deep policy networks makes it hard to explore the problem space while performing a local search at the same time. We found in our empirical analyses that search processes often become stuck in a local region, leading to severe performance degradation. To tackle this issue, we propose a novel method for multi-mode policy learning. In the proposed method, a conventional exploration-exploitation scheme is reformulated as the problem of learning to escape from a local search area to induce exploration. We present a multi-mode Markov decision process, followed by policy and value design for the local search and escaping modes. Experimental results show that the proposed method outperforms various baseline models, suggesting that the learned escaping policy allows the model to efficiently initiate a new local search in promising regions.


Pages: 107-117

Page count: 11
