50 results in total
- [41] Near-optimal Reinforcement Learning in Factored MDPs. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 27 (NIPS 2014), 2014, 27
- [42] Reinforcement learning for MDPs using temporal difference schemes. PROCEEDINGS OF THE 36TH IEEE CONFERENCE ON DECISION AND CONTROL, VOLS 1-5, 1997: 577-583
- [43] Path Consistency Learning in Tsallis Entropy Regularized MDPs. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 80, 2018, 80
- [44] Learning models of relational MDPs using graph kernels. MICAI 2007: ADVANCES IN ARTIFICIAL INTELLIGENCE, 2007, 4827: 409-+
- [45] Exploiting Additive Structure in Factored MDPs for Reinforcement Learning. RECENT ADVANCES IN REINFORCEMENT LEARNING, 2008, 5323: 15-26
- [46] Active Learning from Crowds with Unsure Option. PROCEEDINGS OF THE TWENTY-FOURTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE (IJCAI), 2015: 1061-1067
- [47] States evolution in Θ(λ)-learning based on logical MDPs with negation. 2007 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS, VOLS 1-8, 2007: 2345-2350
- [48] Learning in Online MDPs: Is there a Price for Handling the Communicating Case? UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, 2023, 216: 293-302
- [49] Planning and Learning for Decentralized MDPs with Event Driven Rewards. THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTIETH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / EIGHTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018: 6186-6194
- [50] Inferring financial bubbles from option data. JOURNAL OF APPLIED ECONOMETRICS, 2021: 1013-1046