Double action Q-learning for obstacle avoidance in a dynamically changing environment

Cited: 0
Authors
Ngai, DCK [1 ]
Yung, NHC [1 ]
Affiliation
[1] Univ Hong Kong, Dept Elect & Elect Engn, Hong Kong, Hong Kong, Peoples R China
Source
2005 IEEE Intelligent Vehicles Symposium Proceedings, 2005
Keywords
Q-learning; reinforcement learning; temporal differences; obstacle avoidance;
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we propose a new method for solving the reinforcement learning problem in a dynamically changing environment, such as vehicle navigation, in which the Markov Decision Process used in traditional reinforcement learning is modified so that the response of the environment is taken into consideration when determining the agent's next state. This is achieved by changing the action-value function to handle three parameters at a time, namely the current state, the action taken by the agent, and the action taken by the environment. As it considers the actions of both the agent and the environment, the method is termed "Double Action". The proposed method is implemented on the basis of Q-learning, with the update rule modified to handle all three parameters. Preliminary results show that the proposed method reduces the (negative) sum of rewards by 89.5% compared with the traditional method. Apart from that, our new method also reduces the total number of collisions and the mean number of steps per episode by 89.5% and 15.5%, respectively, relative to the traditional method.
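To make the "Double Action" value function concrete, the following is a minimal tabular sketch in Python. It assumes the Q-table is indexed by (state, agent action, environment action) and that the environment's action at the next step can be observed and used for bootstrapping; the class name, the exact form of the update target, and the hyperparameter values are illustrative assumptions rather than the authors' formulation.

# Hedged sketch of a tabular "double action" Q-update: the value function is
# indexed by (state, agent_action, env_action) instead of (state, action).
# The bootstrap below maximises over the agent's next action given the
# environment action observed at the next step -- an illustrative choice,
# not necessarily the exact rule used in the paper.
import numpy as np

class DoubleActionQLearner:
    def __init__(self, n_states, n_agent_actions, n_env_actions,
                 alpha=0.1, gamma=0.9):
        # Q[s, a, b]: value of agent action a in state s while the environment
        # (e.g. a moving obstacle) takes action b.
        self.Q = np.zeros((n_states, n_agent_actions, n_env_actions))
        self.alpha = alpha  # learning rate
        self.gamma = gamma  # discount factor

    def update(self, s, a, b, r, s_next, b_next):
        # One temporal-difference step for the transition
        # (s, a, b) -> (s_next, b_next) with reward r.
        target = r + self.gamma * np.max(self.Q[s_next, :, b_next])
        self.Q[s, a, b] += self.alpha * (target - self.Q[s, a, b])

    def greedy_action(self, s, b):
        # Best agent action given the environment action b expected in state s.
        return int(np.argmax(self.Q[s, :, b]))

In use, such a learner would be stepped once per simulation tick with the observed environment action (for example, the estimated motion of the nearest obstacle), so that the learned policy conditions the avoidance manoeuvre on what the obstacle is doing.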
Pages: 211-216
Page count: 6