Graded-Q Reinforcement Learning with Information-Enhanced State Encoder for Hierarchical Collaborative Multi-Vehicle Pursuit

Cited by: 2
Authors
Yang, Yiying [1 ]
Li, Xinhang [1 ]
Yuan, Zheng [1 ]
Wang, Qinwen [1 ]
Xu, Chen [1 ]
Zhang, Lin [1 ]
Affiliations
[1] Beijing Univ Posts & Telecommun, Sch Artificial Intelligence, Beijing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
cooperative multi-agent reinforcement learning; hierarchical collaborative multi-vehicle pursuit; GQRL-IESE;
DOI
10.1109/MSN57253.2022.00090
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
The multi-vehicle pursuit (MVP) problem, abstracted from various real-world scenarios, is becoming a hot research topic in Intelligent Transportation Systems (ITS). The combination of Artificial Intelligence (AI) and connected vehicles has greatly promoted research progress on MVP. However, existing works on MVP pay little attention to the importance of information exchange and cooperation among pursuing vehicles in complex urban traffic environments. This paper proposes a graded-Q reinforcement learning with information-enhanced state encoder (GQRL-IESE) framework to address this hierarchical collaborative multi-vehicle pursuit (HCMVP) problem. In the GQRL-IESE, a cooperative graded-Q scheme is proposed to facilitate the decision-making of pursuing vehicles and improve pursuit efficiency. Each pursuing vehicle uses a deep Q network (DQN) to make decisions based on its encoded state, and a coordinated Q optimizing network then adjusts these individual decisions according to the current traffic information to obtain the globally optimal action set. In addition, an information-enhanced state encoder is designed to extract critical information from multiple perspectives and uses an attention mechanism to help each pursuing vehicle effectively determine its target. Extensive experimental results based on SUMO indicate that the proposed GQRL-IESE reduces the total pursuit timesteps by 47.64% on average compared with other methods, demonstrating its excellent pursuit efficiency. Code is open-sourced at https://github.com/ANT-ITS/GQRL-IESE.
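The abstract describes a two-level decision flow: an information-enhanced state encoder with attention produces each pursuer's encoded state, a per-vehicle deep Q network scores its actions, and a coordinated Q optimizing network adjusts the individual choices using global traffic information. The PyTorch sketch below illustrates that flow only; the module names (InfoEnhancedEncoder, PursuerDQN, CoordQNet), layer sizes, and tensor layouts are assumptions of this sketch, not the authors' implementation, which is released at https://github.com/ANT-ITS/GQRL-IESE.

# Minimal sketch of the GQRL-IESE decision flow described in the abstract.
# All sizes and module names are illustrative assumptions, not the released code.
import torch
import torch.nn as nn


class InfoEnhancedEncoder(nn.Module):
    """Encodes a pursuer's observation and attends over candidate evader features."""

    def __init__(self, obs_dim: int, target_dim: int, hidden: int = 64):
        super().__init__()
        self.obs_fc = nn.Linear(obs_dim, hidden)
        self.target_fc = nn.Linear(target_dim, hidden)
        # Single-head attention: the pursuer's own state queries the evader features,
        # standing in for the paper's attention-based target determination.
        self.attn = nn.MultiheadAttention(hidden, num_heads=1, batch_first=True)

    def forward(self, obs: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
        # obs: (B, obs_dim); targets: (B, n_targets, target_dim)
        q = self.obs_fc(obs).unsqueeze(1)             # (B, 1, hidden)
        kv = self.target_fc(targets)                  # (B, n_targets, hidden)
        ctx, _ = self.attn(q, kv, kv)                 # (B, 1, hidden)
        return torch.cat([q.squeeze(1), ctx.squeeze(1)], dim=-1)  # (B, 2*hidden)


class PursuerDQN(nn.Module):
    """Individual level: per-vehicle Q-values over the encoded state."""

    def __init__(self, enc_dim: int, n_actions: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(enc_dim, hidden), nn.ReLU(), nn.Linear(hidden, n_actions)
        )

    def forward(self, enc: torch.Tensor) -> torch.Tensor:
        return self.net(enc)                          # (B, n_actions)


class CoordQNet(nn.Module):
    """Coordinated level: rescores the stacked per-vehicle Q-values together with
    global traffic features to adjust the individual decisions."""

    def __init__(self, n_vehicles: int, n_actions: int, global_dim: int, hidden: int = 64):
        super().__init__()
        self.n_vehicles, self.n_actions = n_vehicles, n_actions
        self.net = nn.Sequential(
            nn.Linear(n_vehicles * n_actions + global_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_vehicles * n_actions),
        )

    def forward(self, per_vehicle_q: torch.Tensor, global_feat: torch.Tensor) -> torch.Tensor:
        # per_vehicle_q: (B, n_vehicles, n_actions); global_feat: (B, global_dim)
        x = torch.cat([per_vehicle_q.flatten(1), global_feat], dim=-1)
        adjusted = self.net(x).view(-1, self.n_vehicles, self.n_actions)
        return adjusted.argmax(dim=-1)                # joint action indices, (B, n_vehicles)

In this sketch, coordination is modeled as a simple MLP that rescores the concatenated per-vehicle Q-values with a global traffic feature; how the paper actually optimizes the joint action set is specified in the full text and repository, not here.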
Pages: 534-541
Number of pages: 8
Related Papers (4 records)
  • [1] Li, Xinhang; Yang, Yiying; Yuan, Zheng; Wang, Zhe; Wang, Qinwen; Xu, Chen; Li, Lei; He, Jianhua; Zhang, Lin. Progression Cognition Reinforcement Learning With Prioritized Experience for Multi-Vehicle Pursuit. IEEE Transactions on Intelligent Transportation Systems, 2024: 1-14.
  • [2] Wang, Qinwen; Li, Xinhang; Yuan, Zheng; Yang, Yiying; Xu, Chen; Zhang, Lin. An Opponent-Aware Reinforcement Learning Method for Team-to-Team Multi-Vehicle Pursuit via Maximizing Mutual Information Indicator. 2022 18th International Conference on Mobility, Sensing and Networking (MSN), 2022: 526-533.
  • [3] Afifi, Ahmed M.; Alhosainy, Omar H.; Elias, Catherine M.; Shehata, Omar M.; Morgan, Elsayed I. Deep Policy-Gradient Based Path Planning and Reinforcement Cooperative Q-Learning Behavior of Multi-Vehicle Systems. 2019 IEEE International Conference of Vehicular Electronics and Safety (ICVES), 2019.
  • [4] Yuan, Zheng; Wu, Tianhao; Wang, Qinwen; Yang, Yiying; Li, Lei; Zhang, Lin. T3OMVP: A Transformer-Based Time and Team Reinforcement Learning Scheme for Observation-Constrained Multi-Vehicle Pursuit in Urban Area. Electronics, 2022, 11(9).