50 entries in total
- [37] Advantage Based Value Iteration for Markov Decision Processes with Unknown Rewards. 2016 International Joint Conference on Neural Networks (IJCNN), 2016, pp. 3837-3844
- [39] Efficient Off-Policy Algorithms for Structured Markov Decision Processes. 2023 62nd IEEE Conference on Decision and Control (CDC), 2023, pp. 8312-8319