The Wisdom of the Crowd: Reliable Deep Reinforcement Learning Through Ensembles of Q-Functions

Cited by: 8
Authors
Elliott, Daniel L. [1 ]
Anderson, Charles [2 ]
Affiliations
[1] Lindsay Corp, Omaha, NE 68802 USA
[2] Colorado State Univ, Dept Comp Sci, Ft Collins, CO 80523 USA
Keywords
Training; Task analysis; Bagging; Stability criteria; Reinforcement learning; Neural networks; Computational modeling; Autonomous systems; machine learning algorithms; neural networks; NEURAL-NETWORKS;
DOI
10.1109/TNNLS.2021.3089425
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Reinforcement learning (RL) agents learn by exploring the environment and then exploiting what they have learned. This frees human trainers from having to know the preferred action or intrinsic value of each encountered state. The cost of this freedom is that RL is slower and less stable than supervised learning. We explore the possibility that ensemble methods can remedy these shortcomings by investigating a novel technique that harnesses the wisdom of crowds: it combines the estimates of several Q-function approximators using a simple combination scheme similar to the supervised learning approach known as bagging. Bagging approaches have not yet found widespread adoption in the RL literature, nor has their performance been comprehensively examined. Our results show that the proposed approach improves performance on all three tasks and for all RL approaches attempted. The primary contribution of this work is a demonstration that the improvement is a direct result of the increased stability of the action portion of the state-action-value function. Subsequent experimentation demonstrates that this stability in learning allows an actor-critic method to find more efficient solutions. Finally, we show that this approach can be used to decrease the amount of time necessary to solve problems that require a deep Q-learning (DQN) approach.
Pages: 43 - 51
Page count: 9
Related papers
50 in total
  • [31] Learn to Steer through Deep Reinforcement Learning
    Wu, Keyu
    Esfahani, Mahdi Abolfazli
    Yuan, Shenghai
    Wang, Han
    SENSORS, 2018, 18 (11)
  • [32] Autonomous exploration through deep reinforcement learning
    Yan, Xiangda
    Huang, Jie
    He, Keyan
    Hong, Huajie
    Xu, Dasheng
    INDUSTRIAL ROBOT-THE INTERNATIONAL JOURNAL OF ROBOTICS RESEARCH AND APPLICATION, 2023, 50 (05) : 793 - 803
  • [33] A framework of deep reinforcement learning for stock evaluation functions
    Luo, Tai-Li
    Wu, Mu-En
    Chen, Chien-Ming
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2020, 38 (05) : 5639 - 5649
  • [34] Robot Navigation in Crowd Based on Dual Social Attention Deep Reinforcement Learning
    Zeng, Hui
    Hu, Rong
    Huang, Xiaohui
    Peng, Zhiying
    MATHEMATICAL PROBLEMS IN ENGINEERING, 2021, 2021 (2021)
  • [35] Multi-Objective Deep Reinforcement Learning for Crowd Route Guidance Optimization
    Nishida, Ryo
    Tanigaki, Yuki
    Onishi, Masaki
    Hashimoto, Koichi
    TRANSPORTATION RESEARCH RECORD, 2024, 2678 (05) : 617 - 633
  • [36] Decentralized Structural-RNN for Robot Crowd Navigation with Deep Reinforcement Learning
    Liu, Shuijing
    Chang, Peixin
    Liang, Weihang
    Chakraborty, Neeloy
    Driggs-Campbell, Katherine
    2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2021), 2021 : 3517 - 3524
  • [37] Emotional Contagion-Aware Deep Reinforcement Learning for Antagonistic Crowd Simulation
    Lv, Pei
    Yu, Qingqing
    Xu, Boya
    Li, Chaochao
    Zhou, Bing
    Xu, Mingliang
    IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2023, 14 (04) : 2939 - 2953
  • [38] Efficient Training Management for Mobile Crowd-Machine Learning: A Deep Reinforcement Learning Approach
    Tran The Anh
    Nguyen Cong Luong
    Niyato, Dusit
    Kim, Dong In
    Wang, Li-Chun
    IEEE WIRELESS COMMUNICATIONS LETTERS, 2019, 8 (05) : 1345 - 1348
  • [39] Towards reliable robot packing system based on deep reinforcement learning
    Xiong, Heng
    Ding, Kai
    Ding, Wan
    Peng, Jian
    Xu, Jianfeng
    ADVANCED ENGINEERING INFORMATICS, 2023, 57
  • [40] Learning heuristics for weighted CSPs through deep reinforcement learning
    Chen, Dingding
    Chen, Ziyu
    He, Zhongshi
    Gao, Junsong
    Su, Zhizhuo
    APPLIED INTELLIGENCE, 2023, 53 (08) : 8844 - 8863