50 items in total
- [31] MAXIMAL REWARDS AND EPSILON-OPTIMAL POLICIES IN CONTINUOUS TIME MARKOV DECISION CHAINS. ANNALS OF STATISTICS, 1974, 2(01): 159-169
- [32] Random Early Detection for Congestion Avoidance in Wired Networks: A Discretized Pursuit Learning-Automata-Like Solution. IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, 2010, 40(01): 66-76
- [37] NONEXISTENCE OF EPSILON-OPTIMAL RANDOMIZED STATIONARY POLICIES IN AVERAGE COST MARKOV DECISION MODELS. ANNALS OF MATHEMATICAL STATISTICS, 1971, 42(05): 1767-&