- [41] Optimism in Face of a Context: Regret Guarantees for Stochastic Contextual MDP. Thirty-Seventh AAAI Conference on Artificial Intelligence, vol. 37, no. 7, 2023, pp. 8510-8517.
- [42] Regret Analysis for RL using Renewal Bandit Feedback. 2022 IEEE Information Theory Workshop (ITW), 2022, pp. 137-142.
- [43] Learning in Structured MDPs with Convex Cost Functions: Improved Regret Bounds for Inventory Management. Operations Research, vol. 70, no. 3, 2022, pp. 1646-1664.
- [45] Learning in Structured MDPs with Convex Cost Functions: Improved Regret Bounds for Inventory Management. ACM EC '19: Proceedings of the 2019 ACM Conference on Economics and Computation, 2019, pp. 743-744.
- [46] PDFA Distillation with Error Bound Guarantees. Implementation and Application of Automata, CIAA 2024, vol. 15015, 2024, pp. 51-65.
- [47] Discovery and density estimation of latent confounders in Bayesian networks with evidence lower bound. International Conference on Probabilistic Graphical Models, vol. 186, 2022.
- [50] Scalable Representation Learning in Linear Contextual Bandits with Constant Regret Guarantees. Advances in Neural Information Processing Systems 35 (NeurIPS 2022), 2022.