- [1] Silver D., Huang A., Maddison C., et al., Mastering the game of Go with deep neural networks and tree search, Nature, 529, 7587, pp. 484-489, (2016)
- [2] Caflisch R.E., Monte Carlo and quasi-Monte Carlo methods, Acta Numerica, 7, pp. 1-49, (1998)
- [3] Thrun S., Monte Carlo POMDPs, Advances in Neural Information Processing Systems, 12, pp. 1064-1070, (1999)
- [4] Littman M.L., Reinforcement learning improves behaviour from evaluative feedback, Nature, 521, 7553, pp. 445-451, (2015)
- [5] Tian Y., A simple analysis of AlphaGo, Acta Automatica Sinica, 42, 5, pp. 671-675, (2016)
- [6] Zhao D., Shao K., Zhu Y., et al., Review of deep reinforcement learning and discussions on the development of computer Go, Control Theory & Applications, 33, 6, pp. 701-717, (2016)