Double action Q-learning for obstacle avoidance in a dynamically changing environment

Cited: 0
Authors
Ngai, DCK [1 ]
Yung, NHC [1 ]
Affiliation
[1] Univ Hong Kong, Dept Elect & Elect Engn, Hong Kong, Hong Kong, Peoples R China
Source
2005 IEEE Intelligent Vehicles Symposium Proceedings, 2005
Keywords
Q-learning; reinforcement learning; temporal differences; obstacle avoidance;
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we propose a new method for solving the reinforcement learning problem in a dynamically changing environment, such as vehicle navigation, in which the Markov Decision Process used in traditional reinforcement learning is modified so that the response of the environment is taken into consideration when determining the agent's next state. This is achieved by changing the action-value function to handle three parameters at a time, namely the current state, the action taken by the agent, and the action taken by the environment. As it considers the actions of both the agent and the environment, the method is termed "Double Action". The proposed method is implemented on the basis of Q-learning, with the update rule modified to handle all three parameters. Preliminary results show that the proposed method reduces the (negative) sum of rewards by 89.5% compared with the traditional method. Apart from that, our new method also reduces the total number of collisions and the mean number of steps per episode by 89.5% and 15.5%, respectively, relative to the traditional method.
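To make the "Double Action" value function concrete, the following is a minimal tabular sketch in Python. It assumes the Q-table is indexed by (state, agent action, environment action) and that the environment's action at the next step can be observed and used for bootstrapping; the class name, the exact form of the update target, and the hyperparameter values are illustrative assumptions rather than the authors' formulation.

# Hedged sketch of a tabular "double action" Q-update: the value function is
# indexed by (state, agent_action, env_action) instead of (state, action).
# The bootstrap below maximises over the agent's next action given the
# environment action observed at the next step -- an illustrative choice,
# not necessarily the exact rule used in the paper.
import numpy as np

class DoubleActionQLearner:
    def __init__(self, n_states, n_agent_actions, n_env_actions,
                 alpha=0.1, gamma=0.9):
        # Q[s, a, b]: value of agent action a in state s while the environment
        # (e.g. a moving obstacle) takes action b.
        self.Q = np.zeros((n_states, n_agent_actions, n_env_actions))
        self.alpha = alpha  # learning rate
        self.gamma = gamma  # discount factor

    def update(self, s, a, b, r, s_next, b_next):
        # One temporal-difference step for the transition
        # (s, a, b) -> (s_next, b_next) with reward r.
        target = r + self.gamma * np.max(self.Q[s_next, :, b_next])
        self.Q[s, a, b] += self.alpha * (target - self.Q[s, a, b])

    def greedy_action(self, s, b):
        # Best agent action given the environment action b expected in state s.
        return int(np.argmax(self.Q[s, :, b]))

In use, such a learner would be stepped once per simulation tick with the observed environment action (for example, the estimated motion of the nearest obstacle), so that the learned policy conditions the avoidance manoeuvre on what the obstacle is doing.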
Pages: 211-216
Page count: 6