Distributed optimization of Markov reward processes

Cited by: 0
Authors
Campos-Nanez, Enrique [1]
Affiliation
[1] George Washington Univ, Dept Engn Management & Syst Engn, Washington, DC 20052 USA
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP [automation technology, computer technology];
Subject classification code
0812;
Abstract
Dynamic programming provides perhaps the most natural way to model many control problems, but suffers from the fact that existing solution procedures do not scale gracefully with the size of the problem. In this work, we present a gradient-based policy search technique that exploits the fact that in many applications the state space and control actions are naturally distributed. After presenting our modeling assumptions, we introduce a technique in which a set of distributed agents compute an estimate of the partial derivative of a system-wide objective with respect to the parameters under their control and use it in a gradient-based policy search procedure. We illustrate the algorithm with an application to energy-efficient coverage in energy harvesting sensor networks. The resulting algorithm can be implemented using only local information available to the sensors, and is therefore fully scalable. Our numerical results are encouraging and allow us to conjecture the usefulness of our approach.
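The abstract describes agents that each estimate the partial derivative of a system-wide objective with respect to the parameters under their control and then take local gradient steps. The sketch below illustrates that general idea only; it is not the paper's algorithm. It uses a stateless toy in which each sensor holds a single parameter (the logit of its probability of sensing), a shared coverage-minus-energy reward is broadcast as a scalar, and each agent forms a likelihood-ratio estimate of its own partial derivative. The reward function, policy parameterization, and all constants are illustrative assumptions.

```python
# Minimal sketch (assumed setup, not the paper's method): distributed
# policy-gradient search where each agent updates only its own parameter
# using its local action and a shared scalar reward signal.
import numpy as np

rng = np.random.default_rng(0)
N = 10                  # number of sensor agents
theta = np.zeros(N)     # each agent's local parameter: logit of its "sense" probability
alpha = 0.05            # gradient step size
episodes, horizon = 300, 50

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

for _ in range(episodes):
    grad = np.zeros(N)                            # per-agent gradient estimates
    for _ in range(horizon):
        p = sigmoid(theta)                        # each agent's sensing probability
        a = (rng.random(N) < p).astype(float)     # local on/off decisions
        # System-wide reward: diminishing coverage benefit minus energy cost.
        reward = np.sqrt(a.sum()) - 0.3 * a.sum()
        # Score-function term for a Bernoulli(p) action with logit parameter is (a - p);
        # weighting it by the shared reward gives each agent a local gradient estimate.
        grad += (a - p) * reward
    theta += alpha * grad / horizon               # each agent steps independently

print("learned sensing probabilities:", np.round(sigmoid(theta), 2))
```

This toy omits the Markov state dynamics and energy-harvesting model treated in the paper; there, the gradient estimates would have to account for the effect of actions on future states, and the reward information available to each sensor is local rather than a globally broadcast scalar.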
Pages: 3921-3926
Number of pages: 6