Double action Q-learning for obstacle avoidance in a dynamically changing environment

Cited: 0
Authors
Ngai, DCK [1 ]
Yung, NHC [1 ]
Institution
[1] Univ Hong Kong, Dept Elect & Elect Engn, Hong Kong, Hong Kong, Peoples R China
Keywords
Q-learning; reinforcement learning; temporal differences; obstacle avoidance;
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
In this paper, we propose a new method for solving the reinforcement learning problem in a dynamically changing environment, such as vehicle navigation, in which the Markov Decision Process used in traditional reinforcement learning is modified so that the response of the environment is taken into account when determining the agent's next state. This is achieved by changing the action-value function to handle three parameters at a time, namely the current state, the action taken by the agent, and the action taken by the environment. As it considers the actions of both the agent and the environment, the method is termed "Double Action". The proposed method is implemented on the basis of Q-learning, with the update rule modified to handle all three parameters. Preliminary results show that the proposed method yields a sum of rewards (negative) 89.5% less than that of the traditional method. Apart from that, our new method also lowers the total number of collisions and the mean number of steps per episode by 89.5% and 15.5%, respectively, compared with the traditional method.
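The following is a minimal sketch of the three-parameter ("double action") value table and update described in the abstract. It is an illustrative reconstruction, not the paper's implementation: the variable names, the tabular representation, and in particular the choice of bootstrapping on the agent's best reply to the environment's observed next action are all assumptions.

```python
import numpy as np

# Illustrative "double action" Q-table: values are indexed by
# (state, agent action, environment action) rather than (state, action).
# Table sizes, learning parameters, and the bootstrap target are assumptions.
N_STATES, N_AGENT_ACTIONS, N_ENV_ACTIONS = 100, 4, 4
ALPHA, GAMMA = 0.1, 0.9

Q = np.zeros((N_STATES, N_AGENT_ACTIONS, N_ENV_ACTIONS))

def double_action_update(s, a, b, r, s_next, b_next):
    """One update step with the environment's action b as an extra index.

    b_next is the environment's action observed at the next state; here the
    target bootstraps on the agent's best response to it (an assumed choice).
    """
    target = r + GAMMA * Q[s_next, :, b_next].max()
    Q[s, a, b] += ALPHA * (target - Q[s, a, b])
```

In the obstacle-avoidance setting of the paper, the environment action could, for example, encode the observed motion of a nearby obstacle, but how that action is represented and predicted is left to the original text.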
Pages: 211-216
Number of pages: 6
Related papers
50 records in total
  • [21] Action Candidate Driven Clipped Double Q-Learning for Discrete and Continuous Action Tasks
    Jiang, Haobo
    Li, Guangyu
    Xie, Jin
    Yang, Jian
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (04) : 5269 - 5279
  • [22] Deep Reinforcement Learning with Double Q-Learning
    van Hasselt, Hado
    Guez, Arthur
    Silver, David
    THIRTIETH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2016, : 2094 - 2100
  • [23] Continuous-Action Q-Learning
    José del R. Millán
    Daniele Posenato
    Eric Dedieu
    Machine Learning, 2002, 49 : 247 - 265
  • [24] Continuous-action Q-learning
    Millán, JDR
    Posenato, D
    Dedieu, E
    MACHINE LEARNING, 2002, 49 (2-3) : 247 - 265
  • [25] Learning to Play Pac-Xon with Q-Learning and Two Double Q-Learning Variants
    Schilperoort, Jits
    Mak, Ivar
    Drugan, Madalina M.
    Wiering, Marco A.
    2018 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (IEEE SSCI), 2018, : 1151 - 1158
  • [26] On the Estimation Bias in Double Q-Learning
    Ren, Zhizhou
    Zhu, Guangxiang
    Hu, Hao
    Han, Beining
    Chen, Jianglun
    Zhang, Chongjie
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [27] Plant adaptation to dynamically changing environment: The shade avoidance response
    Ruberti, I.
    Sessa, G.
    Ciolfi, A.
    Possenti, M.
    Carabelli, M.
    Morelli, G.
    BIOTECHNOLOGY ADVANCES, 2012, 30 (05) : 1047 - 1058
  • [28] A dynamic reward-enhanced Q-learning approach for efficient path planning and obstacle avoidance in mobile robotics
    Gharbi, Atef
    APPLIED COMPUTING AND INFORMATICS, 2024,
  • [29] Q-Learning Based Routing Protocol for Congestion Avoidance
    Godfrey, Daniel
    Kim, Beom-Su
    Miao, Haoran
    Shah, Babar
    Hayat, Bashir
    Khan, Imran
    Sung, Tae-Eung
    Kim, Ki-Il
    CMC-COMPUTERS MATERIALS & CONTINUA, 2021, 68 (03): : 3671 - 3692
  • [30] Research on Robot Obstacle Avoidance and Path Tracking under Dynamically Unknown Environment
    Nie Qingbin
    2017 IEEE 2ND ADVANCED INFORMATION TECHNOLOGY, ELECTRONIC AND AUTOMATION CONTROL CONFERENCE (IAEAC), 2017, : 2607 - 2610