Decentralized approximate dynamic programming for dynamic networks of agents

Cited by: 6
Authors
Lakshmanan, Hariharan [1 ]
Pucci de Farias, Daniela [2 ]
Affiliations
[1] MIT, Dept Civil Engn, 77 Massachusetts Ave, Cambridge, MA 02139 USA
[2] MIT, Dept Mech Engn, Cambridge, MA 02139 USA
Keywords
DOI
10.1109/ACC.2006.1656455
Chinese Library Classification (CLC) number
TP [automation technology; computer technology]
Subject classification code
0812
Abstract
We consider control systems consisting of teams of agents operating in stochastic environments and communicating through a network with dynamic topology. An optimal centralized control policy can be derived from the Q-function associated with the problem. However, computing and storing the Q-function is intractable for systems of practical scale, and a centralized policy may place prohibitive requirements on communication between agents. On the other hand, decentralized optimal control has been shown to be NP-hard even for small systems. Here we propose a general approach to decentralized control based on approximate dynamic programming. We consider approximations of the Q-function via local approximation architectures, which decentralize the task of choosing control actions and can be computed and stored efficiently. We propose and analyze an approximate dynamic programming approach for fitting the Q-function based on linear programming. We show that error bounds previously developed for cost-to-go function approximation via linear programming extend to Q-function approximation. We then consider the problem of decentralizing the task of approximating the Q-function and show that it can be viewed as a resource allocation problem. Motivated by this observation, we propose a decentralized gradient-based algorithm for solving a class of resource allocation problems. Convergence of the algorithm is established, and its convergence rate, measured by the number of iterations required for the magnitude of the gradient to approach zero, is shown to be O(n^{2.5}), where n is the number of agents in the network.
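To make the LP-based fitting step described in the abstract concrete, the following is a minimal sketch, not the authors' decentralized algorithm, of approximate linear programming for a Q-function with a linear architecture Q(x, a) ≈ Φ(x, a) r on a small synthetic MDP. The transition probabilities, stage costs, feature matrix, and uniform state-action relevance weights are all illustrative assumptions; the decentralization of the fit across agents and the dynamic network topology are not modeled here.

```python
# Minimal sketch (assumptions noted below) of approximate linear programming
# for Q-function approximation with a linear architecture Q = Phi @ r.
# The MDP, features, and relevance weights are hypothetical illustrations.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)

nS, nA, K, alpha = 5, 3, 4, 0.9                 # states, actions, features, discount
P = rng.dirichlet(np.ones(nS), size=(nS, nA))   # P[x, a, y]: transition probabilities
g = rng.uniform(0.0, 1.0, size=(nS, nA))        # g[x, a]: stage costs

# Feature matrix with one row per (x, a) pair; includes a constant feature.
Phi = np.hstack([np.ones((nS * nA, 1)), rng.normal(size=(nS * nA, K - 1))])

# Decision variables z = [r (K weights); m (nS auxiliary variables)].
# Maximize c^T Phi r subject to
#   Phi(x, a) r <= g(x, a) + alpha * sum_y P(y | x, a) m(y)   for all (x, a)
#   m(y)        <= Phi(y, b) r                                for all (y, b)
# which together enforce the Bellman inequality Q <= FQ for Q = Phi r.
c_weights = np.ones(nS * nA) / (nS * nA)        # uniform relevance weights (an assumption)
obj = np.concatenate([-Phi.T @ c_weights, np.zeros(nS)])    # linprog minimizes

rows_A, rows_b = [], []
for x in range(nS):
    for a in range(nA):
        row = np.zeros(K + nS)
        row[:K] = Phi[x * nA + a]               # Phi(x, a) r
        row[K:] = -alpha * P[x, a]              # -alpha * sum_y P(y|x,a) m(y)
        rows_A.append(row)
        rows_b.append(g[x, a])
for y in range(nS):
    for b in range(nA):
        row = np.zeros(K + nS)
        row[K + y] = 1.0                        # m(y)
        row[:K] = -Phi[y * nA + b]              # -Phi(y, b) r
        rows_A.append(row)
        rows_b.append(0.0)

res = linprog(obj, A_ub=np.array(rows_A), b_ub=np.array(rows_b),
              bounds=(None, None), method="highs")
assert res.success, res.message

r = res.x[:K]
Q_hat = (Phi @ r).reshape(nS, nA)               # fitted Q-function
print("greedy policy:", Q_hat.argmin(axis=1))   # cost-minimizing action per state
```

The auxiliary variables m(y) play the role of min_b Q(y, b), so the linear constraints jointly enforce the Bellman inequality Q ≤ FQ; this mirrors the cost-to-go linear program whose error bounds, per the abstract, extend to Q-function approximation.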
Pages: 1648+
Number of pages: 2