Exploring Deep Reinforcement Learning for Task Dispatching in Autonomous On-Demand Services

Cited by: 4

Authors:

Yang, Lei [1]

Yu, Xi [1]

Cao, Jiannong [2]

Liu, Xuxun [3]

Zhou, Pan [4]

Affiliations:

[1] South China Univ Technol, Sch Software Engn, 382 Waihuandong Rd, Guangzhou 510006, Guangdong, Peoples R China

[2] Hong Kong Polytech Univ, Dept Comp, Kowloon, 11 Yucai Rd, Hong Kong, Peoples R China

[3] South China Univ Technol, Sch Elect & Informat Engn, 382 Waihuandong Rd, Guangzhou 510006, Guangdong, Peoples R China

[4] Huazhong Univ Sci & Technol, Hubei Engn Res Ctr Big Data Secur, Sch Cyber Sci & Engn, 1037 Luoyu Rd, Wuhan 430074, Hubei, Peoples R China

Source:

ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA | 2021, Vol. 15, Issue 3

Funding:

National Natural Science Foundation of China

Keywords:

Demand dispatching; on-demand services; deep reinforcement learning; ASSIGNMENT;

DOI:

10.1145/3442343

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

Autonomous on-demand services, such as GOGOX (formerly GoGoVan) in Hong Kong, provide a platform for users to request services and for suppliers to meet such demands. On such a platform, suppliers have the autonomy to accept or reject the demands dispatched to them, so matching demands and suppliers online is challenging. Existing methods use round-based approaches to dispatch demands. In these works, the dispatching decision is based on the predicted response patterns of suppliers to demands in the current round, but they all fail to consider the impact of future demands and suppliers on the current dispatching decision. This can lead to a dispatching decision that is suboptimal from the future perspective. To solve this problem, we propose a novel demand dispatching model using deep reinforcement learning. In this model, we treat each demand as an agent. The action of each agent, i.e., the dispatching decision of each demand, is determined by a centralized algorithm in a coordinated way. The model works in the following two steps. (1) It learns the demand's expected value in each spatiotemporal state using historical transition data. (2) Based on the learned values, it conducts a many-to-many dispatching using a combinatorial optimization algorithm that considers both immediate rewards and the expected values of demands in the next round. To obtain a higher total reward, demands with a high expected value (short response time) in the future may be delayed to the next round. Conversely, demands with a low expected value (long response time) in the future are dispatched immediately. Through extensive experiments using real-world datasets, we show that the proposed model outperforms the existing models in terms of Cancellation Rate and Average Response Time.
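The two steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a tabular TD(0) update in place of a deep network, a one-to-one assignment searched exhaustively in place of the many-to-many combinatorial optimization, and hypothetical names (`learn_values`, `dispatch`, `GAMMA`, `ALPHA`) and reward values that do not come from the paper.

```python
from itertools import product

GAMMA = 0.9  # discount factor (assumed value, not from the paper)
ALPHA = 0.1  # learning rate (assumed value, not from the paper)


def learn_values(transitions, epochs=50):
    """Step 1 (sketch): tabular TD(0) over spatiotemporal states.

    transitions: list of (state, reward, next_state) tuples, where a state
    is any hashable key such as (time_slot, cell) and next_state is None
    when the demand's episode ends (for example, it was accepted).
    """
    values = {}
    for _ in range(epochs):
        for state, reward, next_state in transitions:
            v_next = 0.0 if next_state is None else values.get(next_state, 0.0)
            v = values.get(state, 0.0)
            values[state] = v + ALPHA * (reward + GAMMA * v_next - v)
    return values


def dispatch(demands, suppliers, immediate_reward, values):
    """Step 2 (sketch): pick, per demand, one supplier or None (delay).

    demands: list of (demand_id, next_state) pairs, where next_state is the
    state the demand would occupy if delayed to the next round.
    immediate_reward: dict mapping (demand_id, supplier) to a reward.
    Exhaustive search, so only suitable for tiny illustrative instances.
    """
    best_total, best_plan = float("-inf"), None
    options = list(suppliers) + [None]
    for plan in product(options, repeat=len(demands)):
        used = [s for s in plan if s is not None]
        if len(used) != len(set(used)):
            continue  # simplification: each supplier gets at most one demand
        total = 0.0
        for (demand_id, next_state), supplier in zip(demands, plan):
            if supplier is None:
                # Delayed demand: discounted expected value in the next round.
                total += GAMMA * values.get(next_state, 0.0)
            else:
                total += immediate_reward[(demand_id, supplier)]
        if total > best_total:
            best_total, best_plan = total, plan
    return dict(zip([d for d, _ in demands], best_plan)), best_total
```

With a single supplier and two demands of equal immediate reward, this sketch delays the demand whose next-round state has the higher learned value and dispatches the other one immediately, which mirrors the trade-off the abstract describes.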


Pages: 23
