GE-DDRL: Graph Embedding and Deep Distributional Reinforcement Learning for Reliable Shortest Path: A Universal and Scale Free Solution

Cited by: 2

Authors:

Guo, Hongliang [1]

Sheng, Wenda [2]

Zhou, Yingjie [1]

Chen, Yunping [2]

Affiliations:

[1] Sichuan Univ, Coll Comp Sci, Chengdu 610017, Peoples R China

[2] Univ Elect Sci & Technol China, Sch Automat Engn, Chengdu 611731, Peoples R China

Source:

IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS | 2023 / Vol. 24 / No. 11

Keywords:

Reliable shortest path (RSP); distributional reinforcement learning (DRL); universal and scale free solution; graph embedding and deep distributional reinforcement learning (GE-DDRL); skip-gram; TRAVEL-TIME; STOCHASTIC NETWORKS; ALGORITHM

DOI:

10.1109/TITS.2023.3285770

Chinese Library Classification number:

TU [Building Science]

Discipline classification code:

0813

Abstract:

This paper studies the reliable shortest path (RSP) problem in stochastic transportation networks. State-of-the-art RSP solutions usually target one specific RSP problem; moreover, the computational complexity of the corresponding algorithms scales at least linearly with the size of the underlying transportation network. In this paper, we propose a graph embedding and deep distributional reinforcement learning (GE-DDRL) method, which serves as a universal and scale-free solution to the RSP problem. GE-DDRL uses deep distributional reinforcement learning (DDRL) to estimate the full travel-time distribution of a given routing policy, and improves that policy with the generalized policy iteration (GPI) scheme. Further, to generalize to new destination nodes, we employ one of the canonical graph embedding techniques (Skip-Gram) to compress the nodes' representations into d-dimensional real-valued vectors. With properly compressed node features, GE-DDRL is able to generalize its estimate of the routing policy's travel-time distribution to untrained destination nodes, and hence achieves the 'all-to-all' navigation functionality. To the best of our knowledge, GE-DDRL is the first RSP planner that applies simultaneously to almost all RSP objectives and, at the same time, is scale free with respect to the size of the transportation network in terms of online decision-making time and memory complexity. Experimental results and comparisons with state-of-the-art methods show the efficacy and efficiency of GE-DDRL in a range of transportation networks.


Pages: 12196-12214

Page count: 19

4 records in total

[1] DRL Router: Distributional Reinforcement Learning-Based Router for Reliable Shortest Path Problems
Guo, Hongliang
Sheng, Wenda
Gao, Chen
Jin, Yaochu
IEEE INTELLIGENT TRANSPORTATION SYSTEMS MAGAZINE, 2023, 15 (05) : 91 - 108
[2] Significant Sampling for Shortest Path Routing: A Deep Reinforcement Learning Solution
Shao, Yulin
Rezaee, Arman
Liew, Soung Chang
Chan, Vincent W. S.
IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, 2020, 38 (10) : 2234 - 2248
[3] Significant Sampling for Shortest Path Routing: A Deep Reinforcement Learning Solution
Shao, Yulin
Rezaee, Arman
Liew, Soung Chang
Chan, Vincent W. S.
2019 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM), 2019
[4] Deep Reinforcement Learning with Graph Neural Networks for Capacitated Shortest Path Tour based Service Chaining
Hara, Takanori
Sasabe, Masahiro
2022 18TH INTERNATIONAL CONFERENCE ON NETWORK AND SERVICE MANAGEMENT (CNSM 2022): INTELLIGENT MANAGEMENT OF DISRUPTIVE NETWORK TECHNOLOGIES AND SERVICES, 2022 : 19 - 27
