Optimal policies for constrained average-cost Markov decision processes

Cited by: 0

Authors:

Juan González-Hernández

César E. Villarreal

Affiliations:

[1] Universidad Nacional Autónoma de México, Departamento de Probabilidad y Estadística, Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas

[2] Universidad Autónoma de Nuevo León, Posgrado en Ingeniería de Sistemas, Facultad de Ingeniería Mecánica y Eléctrica

[3] Ciudad Universitaria

Source:

TOP | 2011 / Vol. 19

Keywords:

Markov decision processes; Constraints; Stable measures; 90C40

DOI:

Not available

Abstract:

We give mild conditions for the existence of optimal solutions to a Markov decision problem with average cost, subject to m constraints of the same kind, in Borel state and action spaces. Moreover, there is an optimal policy that is a convex combination of at most m+1 deterministic policies.
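The structural claim in the abstract can be illustrated on a toy case far simpler than the Borel setting of the paper. The sketch below assumes a single-state MDP with finitely many actions and m = 1 constraint, where a stationary policy is just a probability vector over actions and the average cost equals the expected one-step cost. The function name `best_mixture` and the numbers in the usage note are hypothetical and are not taken from the paper. Under these assumptions the problem is a small linear program whose vertices use at most m+1 = 2 actions, so enumerating pure actions and tight two-action mixtures finds an optimum when one exists.

```python
from itertools import combinations


def best_mixture(costs, penalties, bound):
    """Toy illustration (single state, one constraint, finite actions).

    Minimise the expected cost over action distributions whose expected
    penalty is at most `bound`. Returns (cost, {action: probability}) or
    None when no feasible distribution exists.
    """
    best = None
    n = len(costs)

    # Vertices where the constraint is slack: deterministic (pure) actions.
    for a in range(n):
        if penalties[a] <= bound:
            if best is None or costs[a] < best[0]:
                best = (costs[a], {a: 1.0})

    # Vertices where the constraint is tight: mixtures of two actions.
    for a, b in combinations(range(n), 2):
        if penalties[a] == penalties[b]:
            continue  # no unique mixing weight makes the constraint tight
        p = (bound - penalties[b]) / (penalties[a] - penalties[b])
        if 0.0 < p < 1.0:
            cost = p * costs[a] + (1.0 - p) * costs[b]
            if best is None or cost < best[0]:
                best = (cost, {a: p, b: 1.0 - p})

    return best
```

For example, with `costs = [1.0, 4.0]`, `penalties = [3.0, 0.0]` and `bound = 1.5`, the cheap action alone violates the constraint, the safe action alone is feasible but expensive, and the best choice mixes the two deterministic policies with equal weight.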

Pages: 107-120

Page count: 13
