Hyper-heuristic for CVRP with reinforcement learning

Cited by: 0
Authors
Zhang J. [1]
Feng Q. [1]
Zhao Y. [1]
Liu J. [1]
Leng L. [1]
Affiliations
[1] Key Laboratory of Special Equipment Manufacturing and Advanced Processing Technology, Ministry of Education, Zhejiang University of Technology, Hangzhou
Keywords
Deep Q neural network; Hyper-heuristic algorithm; Reinforcement learning; Vehicle routing problem
DOI
10.13196/j.cims.2020.04.025
Abstract
To reduce the tendency to fall into local optima when solving the capacitated vehicle routing problem (CVRP), a hyper-heuristic algorithm based on reinforcement learning was proposed. A high-level heuristic strategy was designed, comprising a selection strategy and an acceptance criterion. For the selection strategy, the deep Q neural network algorithm from reinforcement learning was used, evaluating the performance of the low-level operators through rewards and penalties. The acceptance criterion combined rewards and penalties with simulated annealing, and a sequence pool of high-quality solutions was constructed to guide the search effectively. In addition, a clustering method was used to improve the quality of the initial solution. The algorithm was compared with other algorithms in terms of best value, error rate, and average value. The experimental results show that the proposed algorithm is effective and stable in solving the problem, and that its overall solution quality is better than that of the comparison algorithms. © 2020, Editorial Department of CIMS. All rights reserved.
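The selection-plus-acceptance loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a simple per-operator value estimate with epsilon-greedy selection stands in for the deep Q neural network, the sequence pool and clustering-based initial solution are omitted, and every name, parameter value, and the toy instance below are hypothetical.

```python
import math
import random

def route_cost(perm, coords, demands, capacity):
    """Total distance of a customer permutation split greedily into
    capacity-feasible routes (node 0 is the depot; assumes every
    single demand fits in one vehicle)."""
    total, load, prev = 0.0, 0, 0
    for c in perm:
        if load + demands[c] > capacity:          # close route, start a new vehicle
            total += math.dist(coords[prev], coords[0])
            prev, load = 0, 0
        total += math.dist(coords[prev], coords[c])
        prev, load = c, load + demands[c]
    return total + math.dist(coords[prev], coords[0])

# Low-level operators (hypothetical choices): each returns a modified copy.
def swap(p, rng):
    i, j = rng.sample(range(len(p)), 2)
    q = p[:]
    q[i], q[j] = q[j], q[i]
    return q

def relocate(p, rng):
    q = p[:]
    c = q.pop(rng.randrange(len(q)))
    q.insert(rng.randrange(len(q) + 1), c)
    return q

def two_opt(p, rng):
    i, j = sorted(rng.sample(range(len(p)), 2))
    return p[:i] + p[i:j + 1][::-1] + p[j + 1:]

def hyper_heuristic(coords, demands, capacity, iters=5000, seed=0):
    rng = random.Random(seed)
    ops = [swap, relocate, two_opt]
    q_values = [0.0] * len(ops)       # stand-in for the deep Q network
    alpha, epsilon = 0.1, 0.2         # assumed learning and exploration rates
    temp, cooling = 10.0, 0.999       # assumed annealing schedule
    current = list(range(1, len(coords)))
    rng.shuffle(current)              # random start (the paper uses clustering)
    cur_cost = route_cost(current, coords, demands, capacity)
    best, best_cost = current[:], cur_cost
    for _ in range(iters):
        # Selection strategy: epsilon-greedy over learned operator values.
        if rng.random() < epsilon:
            k = rng.randrange(len(ops))
        else:
            k = max(range(len(ops)), key=q_values.__getitem__)
        cand = ops[k](current, rng)
        cand_cost = route_cost(cand, coords, demands, capacity)
        delta = cand_cost - cur_cost
        # Reward an improving operator, penalize a non-improving one.
        reward = 1.0 if delta < 0 else -1.0
        q_values[k] += alpha * (reward - q_values[k])
        # Acceptance criterion: simulated annealing.
        if delta < 0 or rng.random() < math.exp(-delta / temp):
            current, cur_cost = cand, cand_cost
            if cur_cost < best_cost:
                best, best_cost = current[:], cur_cost
        temp = max(temp * cooling, 1e-6)
    return best, best_cost

# Toy instance (made up for illustration): depot at index 0, six customers.
coords = [(0, 0), (2, 1), (3, 4), (-1, 3), (-3, -2), (1, -4), (4, -1)]
demands = [0, 3, 4, 2, 5, 3, 4]
best, cost = hyper_heuristic(coords, demands, capacity=8)
print(best, round(cost, 2))
```

Encoding a solution as a single permutation that is split into routes by capacity keeps the operators trivial; a fuller implementation would likely operate on explicit routes and feed richer state features to the network.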
Pages: 1118-1129
Page count: 11
Related papers
32 in total
  • [1] Dantzig G.B., Ramser J.H., The truck dispatching problem, Management Science, 6, 1, pp. 80-91, (1959)
  • [2] Fischetti M., Toth P., Vigo D., A branch-and-bound algorithm for the capacitated vehicle routing problem on directed graphs, Operations Research, 42, 5, pp. 846-859, (1994)
  • [3] Fisher M.L., Optimal solution of vehicle routing problems using minimum K-trees, Operations Research, 42, 4, pp. 626-642, (1994)
  • [4] Christofides N., Mingozzi A., Toth P., State-space relaxation procedures for the computation of bounds to routing problems, Networks, 11, 2, pp. 145-164, (1981)
  • [5] Clarke G., Wright J.W., Scheduling of vehicles from a central depot to a number of delivery points, Operations Research, 12, 4, pp. 568-581, (1964)
  • [6] Bramel J., Simchi-Levi D., A location based heuristic for general routing problems, Operations Research, 43, 4, pp. 649-660, (1995)
  • [7] Osman I.H., Metastrategy simulated annealing and tabu search algorithms for the vehicle routing problem, Annals of Operations Research, 41, 4, pp. 421-451, (1993)
  • [8] Bullnheimer B., Hartl R.F., Strauss C., An improved ant system algorithm for the vehicle routing problem, Annals of Operations Research, 89, 1, pp. 319-328, (1999)
  • [9] Baker B.M., Ayechew M.A., A genetic algorithm for the vehicle routing problem, Computers & Operations Research, 30, 5, pp. 787-800, (2003)
  • [10] Marinakis Y., Iordanidou G., Marinaki M., Particle swarm optimization for the vehicle routing problem with stochastic demands, Applied Soft Computing, 13, 4, pp. 1693-1704, (2013)