50 items in total
- [32] Learning to Pour using Deep Deterministic Policy Gradients. 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2018: 3074-3079
- [33] Action control, forward models and expected rewards: representations in reinforcement learning. Synthese, 2021, 199: 14017-14033
- [34] Inverse Reinforcement Learning with Explicit Policy Estimates. Thirty-Fifth AAAI Conference on Artificial Intelligence, Thirty-Third Conference on Innovative Applications of Artificial Intelligence and the Eleventh Symposium on Educational Advances in Artificial Intelligence, 2021, 35: 9472-9480
- [36] Reward Certification for Policy Smoothed Reinforcement Learning. Thirty-Eighth AAAI Conference on Artificial Intelligence, Vol 38 No 19, 2024: 21429-21437
- [37] Integrating Classical Control into Reinforcement Learning Policy. Neural Processing Letters, 2021, 53: 1709-1722
- [38] Unified Policy Optimization for Robust Reinforcement Learning. Asian Conference on Machine Learning, Vol 101, 2019: 395-410
- [39] A modification of gradient policy in reinforcement learning procedure. 2012 15th International Conference on Interactive Collaborative Learning (ICL), 2012
- [40] Model-free Policy Learning with Reward Gradients. International Conference on Artificial Intelligence and Statistics, Vol 151, 2022