50 entries in total
- [43] Towards Offline Reinforcement Learning with Pessimistic Value Priors [J]. EPISTEMIC UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, EPI UAI 2023, 2024, 14523 : 89 - 100
- [44] Constraints Penalized Q-learning for Safe Offline Reinforcement Learning [J]. THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 8753 - 8760
- [48] Greedy action selection and pessimistic Q-value updates in cooperative Q-learning [J]. 2018 57TH ANNUAL CONFERENCE OF THE SOCIETY OF INSTRUMENT AND CONTROL ENGINEERS OF JAPAN (SICE), 2018, : 821 - 826
- [49] Learning on the Job: Self-Rewarding Offline-to-Online Finetuning for Industrial Insertion of Novel Connectors from Vision [J]. 2023 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2023), 2023, : 7154 - 7161
- [50] Koopman Q-learning: Offline Reinforcement Learning via Symmetries of Dynamics [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022