14 items in total
- [2] Self-Imitation Learning via Generalized Lower Bound Q-learning. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
- [3] Learning Robotic Skills via Self-Imitation and Guide Reward. 2021 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2021: 2158-2163
- [5] Harnessing Network Effect for Fake News Mitigation: Selecting Debunkers via Self-Imitation Learning. THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 20, 2024: 22447-22456
- [7] Self-Practice Imitation Learning from Weak Policy. PARTIALLY SUPERVISED LEARNING, PSL 2013, 2013, 8193: 9-20
- [8] Offline Reinforcement Learning via Policy Regularization and Ensemble Q-Functions. 2022 IEEE 34TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE, ICTAI, 2022: 1167-1174
- [9] Self-Adaptive Imitation Learning: Learning Tasks with Delayed Rewards from Sub-optimal Demonstrations. THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022: 9269-9277
- [10] DROID: Learning from Offline Heterogeneous Demonstrations via Reward-Policy Distillation. CONFERENCE ON ROBOT LEARNING, VOL 229, 2023, 229