Assessment of Reward Functions for Reinforcement Learning Traffic Signal Control under Real-World Limitations

Authors
Egea, Alvaro Cabrejas [1, 2]
Howell, Shaun [2 ]
Knutins, Maksis [2 ]
Connaughton, Colm [3 ]
Affiliations
[1] Univ Warwick, MathSys Ctr Doctoral Training, Coventry CV4 7AL, W Midlands, England
[2] Vivac Labs, London NW5 3AQ, England
[3] Univ Warwick, Warwick Math Inst, Coventry CV4 7AL, W Midlands, England
Funding
Innovate UK project
Keywords
Reinforcement Learning; Urban Traffic Control; Smart Cities; Agent-Based Modeling;
DOI
10.1109/smc42975.2020.9283498
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Adaptive traffic signal control is one key avenue for mitigating the growing consequences of traffic congestion. Incumbent solutions such as SCOOT and SCATS require regular and time-consuming calibration, cannot optimise well for multiple road use modalities, and require the manual curation of many implementation plans. A recent alternative to these approaches is deep reinforcement learning, in which an agent learns how to take the most appropriate action for a given state of the system. This is guided by neural networks approximating a reward function that provides feedback to the agent regarding the performance of the actions taken, making it sensitive to the specific reward function chosen. Several authors have surveyed the reward functions used in the literature, but attributing outcome differences to reward function choice across works is problematic, as there are many uncontrolled differences as well as different outcome metrics. This paper compares the performance of agents using different reward functions in a simulation of a junction in Greater Manchester, UK, across various demand profiles, subject to real-world constraints: realistic sensor inputs, controllers, calibrated demand, intergreen times and stage sequencing. The reward metrics considered are based on the time spent stopped, lost time, change in lost time, average speed, queue length, junction throughput and variations of these quantities. The performance of these reward functions is compared in terms of total waiting time. We find that speed maximisation resulted in the lowest average waiting times across all demand levels, displaying significantly better performance than other rewards previously introduced in the literature.
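The reward families named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Vehicle` record, the `STOP_THRESHOLD` value, the normalisation by speed limit, and the sign conventions are all assumptions made for illustration, and the paper's exact definitions may differ.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Vehicle:
    """Hypothetical per-vehicle observation (not from the paper)."""
    speed: float         # current speed, m/s
    speed_limit: float   # free-flow speed on the vehicle's lane, m/s
    stopped_time: float  # seconds spent stationary so far


STOP_THRESHOLD = 0.1  # m/s below which a vehicle counts as stopped (assumed)


def queue_length_reward(vehicles: Sequence[Vehicle]) -> float:
    """Negative number of stopped vehicles."""
    return -float(sum(v.speed < STOP_THRESHOLD for v in vehicles))


def stopped_time_reward(vehicles: Sequence[Vehicle]) -> float:
    """Negative total time spent stopped."""
    return -sum(v.stopped_time for v in vehicles)


def lost_time(vehicles: Sequence[Vehicle], dt: float = 1.0) -> float:
    """Time lost over one step of length dt by travelling below the limit."""
    return sum(
        dt * (1.0 - min(v.speed, v.speed_limit) / v.speed_limit)
        for v in vehicles
    )


def lost_time_reward(vehicles: Sequence[Vehicle], dt: float = 1.0) -> float:
    """Negative lost time for the current step."""
    return -lost_time(vehicles, dt)


def delta_lost_time_reward(
    previous_lost: float, vehicles: Sequence[Vehicle], dt: float = 1.0
) -> float:
    """Positive when lost time has decreased since the previous step."""
    return previous_lost - lost_time(vehicles, dt)


def average_speed_reward(vehicles: Sequence[Vehicle]) -> float:
    """Mean speed as a fraction of the speed limit; 0.0 for an empty junction."""
    if not vehicles:
        return 0.0
    return sum(v.speed / v.speed_limit for v in vehicles) / len(vehicles)


def throughput_reward(vehicles_departed: int) -> float:
    """Number of vehicles that cleared the junction during the step."""
    return float(vehicles_departed)


if __name__ == "__main__":
    snapshot = [Vehicle(0.0, 10.0, 5.0), Vehicle(5.0, 10.0, 0.0)]
    print(queue_length_reward(snapshot), average_speed_reward(snapshot))
```

In this sketch every function returns a value to be maximised, so penalties such as queue length and lost time are negated. How the speed reward behaves for an empty junction is a design choice left open here.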
Pages: 965-972
Number of pages: 8