Double action Q-learning for obstacle avoidance in a dynamically changing environment

Cited: 0
Authors
Ngai, DCK [1 ]
Yung, NHC [1 ]
Institution
[1] Univ Hong Kong, Dept Elect & Elect Engn, Hong Kong, Hong Kong, Peoples R China
Keywords
Q-learning; reinforcement learning; temporal differences; obstacle avoidance;
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
In this paper, we propose a new method for solving the reinforcement learning problem in a dynamically changing environment, such as vehicle navigation, in which the Markov Decision Process used in traditional reinforcement learning is modified so that the response of the environment is taken into account when determining the agent's next state. This is achieved by changing the action-value function to handle three parameters at a time, namely the current state, the action taken by the agent, and the action taken by the environment. As it considers the actions of both the agent and the environment, the method is termed "Double Action". The proposed method is implemented on the basis of Q-learning, with the update rule modified to handle all three parameters. Preliminary results show that the proposed method yields a sum of rewards (negative) 89.5% less than that of the traditional method. Apart from that, our new method also lowers the total number of collisions and the mean number of steps per episode by 89.5% and 15.5%, respectively, compared with the traditional method.
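The following is a minimal sketch of the three-parameter ("double action") value table and update described in the abstract. It is an illustrative reconstruction, not the paper's implementation: the variable names, the tabular representation, and in particular the choice of bootstrapping on the agent's best reply to the environment's observed next action are all assumptions.

```python
import numpy as np

# Illustrative "double action" Q-table: values are indexed by
# (state, agent action, environment action) rather than (state, action).
# Table sizes, learning parameters, and the bootstrap target are assumptions.
N_STATES, N_AGENT_ACTIONS, N_ENV_ACTIONS = 100, 4, 4
ALPHA, GAMMA = 0.1, 0.9

Q = np.zeros((N_STATES, N_AGENT_ACTIONS, N_ENV_ACTIONS))

def double_action_update(s, a, b, r, s_next, b_next):
    """One update step with the environment's action b as an extra index.

    b_next is the environment's action observed at the next state; here the
    target bootstraps on the agent's best response to it (an assumed choice).
    """
    target = r + GAMMA * Q[s_next, :, b_next].max()
    Q[s, a, b] += ALPHA * (target - Q[s, a, b])
```

In the obstacle-avoidance setting of the paper, the environment action could, for example, encode the observed motion of a nearby obstacle, but how that action is represented and predicted is left to the original text.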
Pages: 211-216
Number of pages: 6
Related papers
50 records in total
  • [21] Action Candidate Driven Clipped Double Q-Learning for Discrete and Continuous Action Tasks
    Jiang, Haobo
    Li, Guangyu
    Xie, Jin
    Yang, Jian
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (04) : 5269 - 5279
  • [22] Deep Reinforcement Learning with Double Q-Learning
    van Hasselt, Hado
    Guez, Arthur
    Silver, David
    THIRTIETH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2016, : 2094 - 2100
  • [23] Continuous-Action Q-Learning
    José del R. Millán
    Daniele Posenato
    Eric Dedieu
    Machine Learning, 2002, 49 : 247 - 265
  • [24] Continuous-action Q-learning
    Millán, JDR
    Posenato, D
    Dedieu, E
    MACHINE LEARNING, 2002, 49 (2-3) : 247 - 265
  • [25] Learning to Play Pac-Xon with Q-Learning and Two Double Q-Learning Variants
    Schilperoort, Jits
    Mak, Ivar
    Drugan, Madalina M.
    Wiering, Marco A.
    2018 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (IEEE SSCI), 2018, : 1151 - 1158
  • [26] On the Estimation Bias in Double Q-Learning
    Ren, Zhizhou
    Zhu, Guangxiang
    Hu, Hao
    Han, Beining
    Chen, Jianglun
    Zhang, Chongjie
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [27] Plant adaptation to dynamically changing environment: The shade avoidance response
    Ruberti, I.
    Sessa, G.
    Ciolfi, A.
    Possenti, M.
    Carabelli, M.
    Morelli, G.
    BIOTECHNOLOGY ADVANCES, 2012, 30 (05) : 1047 - 1058
  • [28] A dynamic reward-enhanced Q-learning approach for efficient path planning and obstacle avoidance in mobile robotics
    Gharbi, Atef
    APPLIED COMPUTING AND INFORMATICS, 2024,
  • [29] Q-Learning Based Routing Protocol for Congestion Avoidance
    Godfrey, Daniel
    Kim, Beom-Su
    Miao, Haoran
    Shah, Babar
    Hayat, Bashir
    Khan, Imran
    Sung, Tae-Eung
    Kim, Ki-Il
    CMC-COMPUTERS MATERIALS & CONTINUA, 2021, 68 (03): : 3671 - 3692
  • [30] Research on Robot Obstacle Avoidance and Path Tracking under Dynamically Unknown Environment
    Nie Qingbin
    2017 IEEE 2ND ADVANCED INFORMATION TECHNOLOGY, ELECTRONIC AND AUTOMATION CONTROL CONFERENCE (IAEAC), 2017, : 2607 - 2610