A NEW PARALLELIZED OF HIERARCHICAL VALUE ITERATION ALGORITHM FOR DISCOUNTED MARKOV DECISION PROCESSES

Cited by: 0

Authors:

Nachaoui, Mourad ^{[1]}

Chafik, Sanae ^{[2]}

Daoui, Cherki ^{[2]}

Affiliations:

[1] FST Bni Mellal Univ Sultan Moulay Slimane B P, Beni Mellal, Morocco

[2] FST Bni Mellal Univ Sultan Moulay Slimane B P, Beni Mellal, Morocco

Source:

DISCRETE AND CONTINUOUS DYNAMICAL SYSTEMS-SERIES S | 2022

Keywords:

  Markov decision process; value iteration algorithm; message passing interface; sequential; parallel computing; big-data; COMPLEXITY;

DOI:

10.3934/dcdss.2022189

Chinese Library Classification (CLC) number:

O29 [Applied Mathematics];

Subject classification code:

070104 ;

Abstract:

Markov Decision Process (MDP) is a popular mathematical framework for modeling stochastic sequential problems under uncertainty. These models appear in many applications, such as computer science, engineering, telecommunications, and finance, among others. One of the most challenging goals is to reduce complexity in the case of large MDPs. In this paper, we propose an optimal strategy that deals with large MDPs under the discounted reward criterion. The proposed approach is based on an intelligent combination of a decomposition technique and an efficient parallel strategy. The global MDP is split into several "sub-MDPs"; these sub-MDPs are then classified by level following the strongly connected components principle. A master-slave strategy based on the Message Passing Interface (MPI) is proposed to solve the obtained problem. The performance of the proposed approach is shown in terms of scalability, cost, and execution speed.
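
The decomposition idea in the abstract can be illustrated with a minimal sequential sketch. This is not the authors' implementation: the data layout (`P`, `R`), the function names, and the toy three-state MDP are assumptions made for illustration, and the MPI master-slave layer is left out. The sketch finds strongly connected components with Tarjan's algorithm, assigns each component a level (0 for components with no outgoing transitions to other components), and runs discounted value iteration one level at a time. Components on the same level do not depend on each other, so they are the units a master process could hand to workers.

```python
def tarjan_scc(n, succ):
    """Return the SCCs of a graph on nodes 0..n-1, sink components first."""
    index, low = [None] * n, [0] * n
    on_stack, stack, sccs, counter = [False] * n, [], [], [0]

    def visit(v):
        index[v] = low[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack[v] = True
        for w in succ[v]:
            if index[w] is None:
                visit(w)
                low[v] = min(low[v], low[w])
            elif on_stack[w]:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:  # v is the root of a component
            comp = []
            while True:
                w = stack.pop()
                on_stack[w] = False
                comp.append(w)
                if w == v:
                    break
            sccs.append(comp)

    for v in range(n):
        if index[v] is None:
            visit(v)
    return sccs


def solve_by_levels(P, R, gamma=0.9, eps=1e-8):
    """Hypothetical helper: value iteration over SCCs, level by level.

    P[s][a] is a list of (next_state, probability); R[s][a] is a reward.
    """
    n = len(P)
    succ = [{t for a in P[s] for t, p in P[s][a] if p > 0} for s in range(n)]
    sccs = tarjan_scc(n, succ)
    comp_of = {s: i for i, comp in enumerate(sccs) for s in comp}

    # A component's level is 1 + the highest level among components it reaches.
    # Tarjan emits reachable components first, so their levels are already final.
    level = [0] * len(sccs)
    for i, comp in enumerate(sccs):
        for s in comp:
            for t in succ[s]:
                j = comp_of[t]
                if j != i:
                    level[i] = max(level[i], level[j] + 1)

    V = [0.0] * n
    for lvl in range(max(level) + 1):
        # Components sharing a level are independent (parallelizable in principle).
        for i, comp in enumerate(sccs):
            if level[i] != lvl:
                continue
            while True:  # in-place value iteration restricted to this component
                delta = 0.0
                for s in comp:
                    best = max(
                        R[s][a] + gamma * sum(p * V[t] for t, p in P[s][a])
                        for a in P[s]
                    )
                    delta = max(delta, abs(best - V[s]))
                    V[s] = best
                if delta < eps:
                    break
    state_level = [level[comp_of[s]] for s in range(n)]
    return V, state_level


# Toy MDP (assumed for illustration): 0 -> 1 -> 2, where state 2 is absorbing.
P = [
    {"go": [(1, 1.0)], "stay": [(0, 1.0)]},
    {"go": [(2, 1.0)]},
    {"stay": [(2, 1.0)]},
]
R = [{"go": 0.0, "stay": 0.5}, {"go": 0.0}, {"stay": 1.0}]
V, state_level = solve_by_levels(P, R, gamma=0.9)
# V is approximately [8.1, 9.0, 10.0]; state_level is [2, 1, 0]
```

Solving the lowest level first means every component only reads values that have already converged, which is why no component needs to be revisited.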


Pages: 14

50 items in total

[1] A Modified Value Iteration Algorithm for Discounted Markov Decision Processes
Chafik, Sanaa
Daoui, Cherki
[J]. JOURNAL OF ELECTRONIC COMMERCE IN ORGANIZATIONS, 2015, 13 (03) : 47 - 57
[2] THE CONVERGENCE OF VALUE-ITERATION IN DISCOUNTED MARKOV DECISION-PROCESSES
WHITE, DJ
SCHERER, WT
[J]. JOURNAL OF MATHEMATICAL ANALYSIS AND APPLICATIONS, 1994, 182 (02) : 348 - 360
[3] Uniform convergence of value iteration policies for discounted Markov decision processes
Cruz-Suarez, Daniel
Montes-De-Oca, Raul
[J]. BOLETIN DE LA SOCIEDAD MATEMATICA MEXICANA, 2006, 12 (01) : 133 - 148
[4] Topological Value Iteration Algorithm for Markov Decision Processes
Dai, Peng
Goldsmith, Judy
[J]. 20TH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2007, : 1860 - 1865
[5] Value Iteration and Action ε-Approximation of Optimal Policies in Discounted Markov Decision Processes
Montes-De-Oca, Raul
Lemus-Rodriguez, Enrique
[J]. RECENT ADVANCES IN APPLIED MATHEMATICS, 2009, : 213 - +
[6] MONOTONE VALUE-ITERATION FOR DISCOUNTED FINITE MARKOV DECISION-PROCESSES
WHITE, DJ
[J]. JOURNAL OF MATHEMATICAL ANALYSIS AND APPLICATIONS, 1985, 109 (02) : 311 - 324
[7] New prioritized value iteration for Markov decision processes
de Guadalupe Garcia-Hernandez, Ma.
Ruiz-Pinales, Jose
Onaindia, Eva
Gabriel Avina-Cervantes, J.
Ledesma-Orozco, Sergio
Alvarado-Mendez, Edgar
Reyes-Ballesteros, Alberto
[J]. ARTIFICIAL INTELLIGENCE REVIEW, 2012, 37 (02) : 157 - 167
[8] New prioritized value iteration for Markov decision processes
Ma. de Guadalupe Garcia-Hernandez
Jose Ruiz-Pinales
Eva Onaindia
J. Gabriel Aviña-Cervantes
Sergio Ledesma-Orozco
Edgar Alvarado-Mendez
Alberto Reyes-Ballesteros
[J]. Artificial Intelligence Review, 2012, 37 : 157 - 167
[9] An optimistic value iteration for mean-variance optimization in discounted Markov decision processes
Ma, Shuai
Ma, Xiaoteng
Xia, Li
[J]. RESULTS IN CONTROL AND OPTIMIZATION, 2022, 8
[10] COMPUTATIONAL COMPARISON OF VALUE-ITERATION ALGORITHMS FOR DISCOUNTED MARKOV DECISION-PROCESSES
THOMAS, LC
HARLEY, R
LAVERCOMBE, AC
[J]. OPERATIONS RESEARCH LETTERS, 1983, 2 (02) : 72 - 76
