SMAC-tuned Deep Q-learning for Ramp Metering

Times Cited: 0
Authors
ElSamadisy, Omar [1 ,3 ]
Abdulhai, Yazeed [1 ]
Xue, Haoyuan [2 ]
Smirnov, Ilia [1 ]
Khalil, Elias B. [2 ]
Abdulhai, Baher [1 ]
Affiliations
[1] Univ Toronto, Dept Civil Engn, Toronto, ON, Canada
[2] Univ Toronto, Dept Mech & Ind Engn, Toronto, ON, Canada
[3] Arab Acad Sci Technol & Maritime Transport, Coll Engn & Technol, Dept Elect Commun Engn, Alexandria, Egypt
Keywords
Ramp metering; Reinforcement learning; Hyperparameter tuning
DOI
10.1109/SM57895.2023.10112246
Chinese Library Classification
TP39 [Computer Applications]
Subject Classification Code
081203; 0835
Abstract
The demand for transportation increases as a city's population grows, yet significant infrastructure expansion is rarely feasible because of spatial, financial, and environmental constraints. As a result, improving the efficiency of existing infrastructure becomes increasingly critical. Ramp metering (RM) with deep reinforcement learning (RL) is one method to tackle this problem. However, fine-tuning RL hyperparameters for RM has yet to be explored in the literature, potentially leaving performance improvements on the table. In this paper, the Sequential Model-based Algorithm Configuration (SMAC) method is used to fine-tune two essential hyperparameters of a deep RL ramp metering model: the reward discount factor and the decay rate of the explore/exploit ratio. Around 350 experiments with different configurations were run with PySMAC (a Python interface to the hyperparameter optimization tool SMAC) and compared against random search as a baseline. The best reward discount factor indicates that the RL agent should focus on immediate rewards and pay little attention to future rewards. The selected value for the exploration-ratio decay rate shows that the agent should reduce exploration early in training. Both random search and SMAC show the same performance improvement of 19%.
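To make the setup concrete, the following is a minimal sketch of how the two hyperparameters could be handed to SMAC. It is written against the modern SMAC3/ConfigSpace Python API rather than the PySMAC interface the paper used, and the objective function is a hypothetical placeholder standing in for the actual deep Q-learning ramp-metering simulation.

    # Minimal sketch of the tuning loop described in the abstract.
    # Assumptions: the SMAC3 + ConfigSpace packages (the paper itself used
    # PySMAC), and a dummy objective in place of the real traffic simulation.
    import math

    from ConfigSpace import ConfigurationSpace
    from ConfigSpace.hyperparameters import UniformFloatHyperparameter
    from smac import HyperparameterOptimizationFacade, Scenario

    # Search space: the two hyperparameters the paper tunes.
    cs = ConfigurationSpace()
    cs.add_hyperparameter(UniformFloatHyperparameter("gamma", 0.0, 0.99))
    cs.add_hyperparameter(
        UniformFloatHyperparameter("eps_decay", 1e-5, 1e-1, log=True)
    )

    def epsilon(step, eps_decay, eps_start=1.0, eps_end=0.05):
        """Exponentially decayed explore/exploit ratio: a larger eps_decay
        cuts exploration earlier, which is what the paper's result favours."""
        return eps_end + (eps_start - eps_end) * math.exp(-eps_decay * step)

    def train_and_evaluate(config, seed: int = 0) -> float:
        """Hypothetical objective: train a deep Q-learning ramp metering
        agent with the sampled gamma and eps_decay and return a cost
        (e.g. negative mean reward). A dummy quadratic keeps this runnable."""
        gamma, eps_decay = config["gamma"], config["eps_decay"]
        # ... in the real study: run the traffic simulation, annealing the
        # exploration rate with epsilon(step, eps_decay) at each step ...
        return (gamma - 0.1) ** 2 + (math.log10(eps_decay) + 2.0) ** 2

    # Roughly 350 trials, matching the experiment budget in the abstract.
    scenario = Scenario(cs, n_trials=350, deterministic=True)
    smac = HyperparameterOptimizationFacade(scenario, train_and_evaluate)
    incumbent = smac.optimize()
    print("Best configuration:", incumbent)

In the actual study, each trial would train the RL agent in a traffic simulator and return a cost derived from the achieved reward; the incumbent configuration would then yield the low discount factor and fast exploration decay reported above.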
Pages: 65-72
Number of pages: 8