The Wisdom of the Crowd: Reliable Deep Reinforcement Learning Through Ensembles of Q-Functions

Cited by: 8

Authors:

Elliott, Daniel L. [1]

Anderson, Charles [2]

Affiliations:

[1] Lindsay Corp, Omaha, NE 68802 USA

[2] Colorado State Univ, Dept Comp Sci, Ft Collins, CO 80523 USA

Source:

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS | 2023 / Vol. 34 / No. 01

Keywords:

Training; Task analysis; Bagging; Stability criteria; Reinforcement learning; Neural networks; Computational modeling; Autonomous systems; machine learning algorithms; neural networks; NEURAL-NETWORKS;

DOI:

10.1109/TNNLS.2021.3089425

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Reinforcement learning (RL) agents learn by exploring the environment and then exploiting what they have learned. This frees the human trainers from having to know the preferred action or intrinsic value of each encountered state. The cost of this freedom is that RL is slower and more unstable than supervised learning. We explore the possibility that ensemble methods can remedy these shortcomings by investigating a novel technique that harnesses the wisdom of crowds by combining Q-function approximator estimates using a simple combination scheme similar to the supervised learning approach known as bagging. Bagging approaches have not yet found widespread adoption in the RL literature, nor has their performance been comprehensively examined. Our results show that the proposed approach improves performance on all three tasks and with all RL approaches attempted. The primary contribution of this work is a demonstration that the improvement is a direct result of the increased stability of the action portion of the state-action-value function. Subsequent experimentation demonstrates that the stability in learning allows an actor-critic method to find more efficient solutions. Finally, we show that this approach can be used to decrease the amount of time necessary to solve problems that require a deep Q-learning (DQN) approach.
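The abstract describes combining Q-function approximator estimates with a simple scheme similar to bagging, but this record does not spell out the scheme. The sketch below is only an illustration under stated assumptions: it assumes the combination is a plain average of each ensemble member's Q-value per action, and it uses hand-written lists in place of trained approximators. The names `ensemble_q_values`, `greedy_action`, and `state_q` are hypothetical and do not come from the paper.

```python
from statistics import fmean


def ensemble_q_values(member_q_values):
    """Average Q estimates across ensemble members for a single state.

    member_q_values: list with one entry per ensemble member; each entry
    is a list holding that member's Q-value for every action.
    Assumption: a plain mean stands in for the paper's combination scheme.
    """
    return [fmean(per_action) for per_action in zip(*member_q_values)]


def greedy_action(member_q_values):
    """Return the index of the action with the highest averaged Q-value."""
    averaged = ensemble_q_values(member_q_values)
    return max(range(len(averaged)), key=averaged.__getitem__)


# Hypothetical toy data: three ensemble members, two actions, one state.
state_q = [
    [1.0, 0.0],  # member 0 prefers action 0
    [0.0, 0.5],  # member 1 prefers action 1
    [0.2, 0.6],  # member 2 prefers action 1
]

print(ensemble_q_values(state_q))
print(greedy_action(state_q))
```

In this toy case two of the three members individually prefer action 1, yet the averaged estimate selects action 0. Averaging values and voting over actions can therefore disagree, so the choice between them is a design decision that this sketch does not settle.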

Citation

Pages: 43-51

Number of pages: 9

