50 results in total
- [41] Walking Motion Learning of Quadrupedal Walking Robot by Profit Sharing That Can Learn Deterministic Policy for POMDPs Environments SIMULATED EVOLUTION AND LEARNING (SEAL 2014), 2014, 8886 : 323 - 334
- [42] Future-Dependent Value-Based Off-Policy Evaluation in POMDPs ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
- [45] Decentralized Learning of Finite-Memory Policies in Dec-POMDPs IFAC PAPERSONLINE, 2023, 56 (02) : 2601 - 2607
- [46] Search and Explore: Symbiotic Policy Synthesis in POMDPs COMPUTER AIDED VERIFICATION, CAV 2023, PT III, 2023, 13966 : 113 - 135
- [47] Factorized Asymptotic Bayesian Policy Search for POMDPs PROCEEDINGS OF THE TWENTY-SIXTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017 : 4346 - 4352
- [48] A Role-based POMDPs Approach for Decentralized Implicit Cooperation of Multiple Agents 2017 13TH IEEE INTERNATIONAL CONFERENCE ON CONTROL & AUTOMATION (ICCA), 2017 : 496 - 501