Traffic signal timing method based on deep reinforcement learning and extended Kalman filter

Authors
Wu L. [1 ]
Wu Y. [1 ]
Kong F. [2 ]
Li B. [1 ]
Affiliations
[1] College of Electrical Engineering, Henan University of Technology, Zhengzhou
[2] College of Electrical Engineering, Zhengzhou Railway Vocational and Technical College, Zhengzhou
Funding
National Natural Science Foundation of China
Keywords
decision-making ability; deep Q-learning network (DQN); extended Kalman filter (EKF); parameter uncertainty; perception ability; traffic signal timing system
DOI
10.13700/j.bh.1001-5965.2021.0529
Abstract
The deep Q-learning network (DQN) has become an effective method for solving the traffic signal timing problem because of its strong perception and decision-making abilities. In traffic signal timing systems, however, parameter uncertainty caused by external environmental disturbances and internal parameter fluctuations limits its further development. To address this, a traffic signal timing method combining DQN with an extended Kalman filter (DQN-EKF) is proposed. In this method, the uncertain parameters of the estimated network are taken as the state variables, and the target network values with uncertain parameters are taken as the observed variables. The EKF system equations are constructed by combining the process noise, the estimated network values with uncertain parameters, and the system observation noise. An optimal estimate of the parameters in the DQN model is then obtained through iterative EKF updates. The experimental results show that the DQN-EKF timing algorithm is suitable for different traffic environments and can effectively improve the traffic efficiency of vehicles. © 2022 Beijing University of Aeronautics and Astronautics (BUAA). All rights reserved.
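The state/observation framing in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy Q-function, the random-walk process model for the parameters, the noise levels `q_noise` and `r_noise`, and all function names are assumptions made for illustration only. The parameter vector `theta` plays the role of the EKF state, and a target value of the form r + gamma * max Q_target plays the role of the observation.

```python
import numpy as np

def q_value(theta, x):
    """Toy nonlinear Q estimate (assumed form): sum_i a_i * tanh(b_i * x_i)."""
    n = x.size
    a, b = theta[:n], theta[n:]
    return float(a @ np.tanh(b * x))

def q_jacobian(theta, x):
    """Gradient of q_value with respect to theta (the EKF observation Jacobian)."""
    n = x.size
    a, b = theta[:n], theta[n:]
    t = np.tanh(b * x)
    return np.concatenate([t, a * (1.0 - t ** 2) * x])

def td_target(reward, next_q_values, gamma=0.9):
    """Observation fed to the filter: reward plus discounted best target-network value."""
    return reward + gamma * float(np.max(next_q_values))

def ekf_step(theta, P, x, y, q_noise=1e-4, r_noise=1e-2):
    """One EKF update of the parameters, assuming theta follows a random walk."""
    # Predict: parameters stay put, covariance grows by the process noise.
    P_pred = P + q_noise * np.eye(theta.size)
    # Linearize the observation model around the current estimate.
    H = q_jacobian(theta, x)
    S = H @ P_pred @ H + r_noise          # scalar innovation variance
    K = P_pred @ H / S                    # Kalman gain, one entry per parameter
    innovation = y - q_value(theta, x)    # target value minus estimated value
    theta_new = theta + K * innovation
    P_new = P_pred - np.outer(K, H) @ P_pred
    return theta_new, P_new

# Hypothetical usage with made-up numbers.
theta0 = np.array([0.1, 0.2, 0.3, 0.4])
P0 = 0.01 * np.eye(4)
x0 = np.array([0.5, -1.0])                # assumed state-action feature vector
y0 = td_target(reward=0.1, next_q_values=np.array([0.4, 1.0]))
theta1, P1 = ekf_step(theta0, P0, x0, y0)
```

Because the Q-function is nonlinear in `theta`, a single update is only a linearized correction; a large initial covariance can overshoot, which is one reason the filter is applied iteratively over many transitions.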
Pages: 1353-1363
Page count: 10