Run-Time Optimization for Learned Controllers Through Quantitative Games

Cited by: 24
Authors
Avni, Guy [1]
Bloem, Roderick [2]
Chatterjee, Krishnendu [1]
Henzinger, Thomas A. [1]
Konighofer, Bettina [2]
Pranger, Stefan [2]
Affiliations
[1] IST Austria, Klosterneuburg, Austria
[2] Graz Univ Technol, Graz, Austria
Funding
Austrian Science Fund
DOI
10.1007/978-3-030-25540-4_36
Chinese Library Classification
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
A controller is a device that interacts with a plant. At each time point, it reads the plant's state and issues commands with the goal that the plant operates optimally. Constructing optimal controllers is a fundamental and challenging problem. Machine learning techniques have recently been successfully applied to train controllers, yet they have limitations. Learned controllers are monolithic and hard to reason about. In particular, it is difficult to add features without retraining, to guarantee any level of performance, and to achieve acceptable performance when encountering untrained scenarios. These limitations can be addressed by deploying quantitative run-time shields that serve as a proxy for the controller. At each time point, the shield reads the command issued by the controller and may choose to alter it before passing it on to the plant. We show how optimal shields that interfere as little as possible while guaranteeing a desired level of controller performance can be generated systematically and automatically using reactive synthesis. First, we abstract the plant by building a stochastic model. Second, we consider the learned controller to be a black box. Third, we measure controller performance and shield interference by two quantitative run-time measures that are formally defined using weighted automata. Then, the problem of constructing a shield that guarantees maximal performance with minimal interference is the problem of finding an optimal strategy in a stochastic 2-player game "controller versus shield" played on the abstract state space of the plant with a quantitative objective obtained from combining the performance and interference measures. We illustrate the effectiveness of our approach by automatically constructing lightweight shields for learned traffic-light controllers in various road networks. The shields we generate avoid liveness bugs, improve controller performance in untrained and changing traffic situations, and add features to learned controllers, such as giving priority to emergency vehicles.
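The proxy role of a shield described in the abstract can be illustrated with a small sketch. It is not the paper's synthesis procedure: instead of solving a stochastic two-player game, it greedily compares a one-step performance cost with a fixed interference penalty. The names `make_shield`, `expected_cost`, `interference_penalty`, and the toy traffic-light cost are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Hashable, Sequence

State = Hashable
Command = Hashable


def make_shield(
    commands: Sequence[Command],
    expected_cost: Callable[[State, Command], float],
    interference_penalty: float,
) -> Callable[[State, Command], Command]:
    """Build a greedy one-step shield (an assumption, not the paper's game-based shield).

    The shield sees the plant state and the command proposed by the black-box
    controller, and returns the command that is actually passed to the plant.
    """

    def shield(state: State, proposed: Command) -> Command:
        def total(cmd: Command) -> float:
            # Altering the controller's command costs a fixed interference penalty.
            penalty = 0.0 if cmd == proposed else interference_penalty
            return expected_cost(state, cmd) + penalty

        # The proposed command is listed first, so ties favor the controller.
        candidates = [proposed] + [c for c in commands if c != proposed]
        return min(candidates, key=total)

    return shield


# Toy junction (hypothetical): state = (cars waiting north-south, cars waiting east-west).
def waiting_cost(state, cmd):
    queue_ns, queue_ew = state
    # The cost is the queue length of the direction that is kept on red.
    return queue_ew if cmd == "NS_GREEN" else queue_ns


shield = make_shield(["NS_GREEN", "EW_GREEN"], waiting_cost, interference_penalty=3.0)

print(shield((2, 4), "NS_GREEN"))   # small imbalance: the controller's command is kept
print(shield((1, 10), "NS_GREEN"))  # long east-west queue: the shield overrides
```

Raising `interference_penalty` makes the shield defer to the controller more often, which mirrors the trade-off between performance and interference that the abstract describes.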
Pages: 630-649
Number of pages: 20
Related papers
50 in total
  • [1] Run-time integration of robotic manipulators and their controllers
    Proctor, Frederick M.
    Scrapper, Christopher J.
    Balaguer, Benjamin
    [J]. PROCEEDINGS OF THE ASME INTERNATIONAL MECHANICAL ENGINEERING CONGRESS AND EXPOSITION 2007, VOL 3: DESIGN AND MANUFACTURING, 2008, : 229 - 235
  • [2] ROX: Run-time Optimization of XQueries
    Kader, Riham Abdel
    Boncz, Peter
    Manegold, Stefan
    van Keulen, Maurice
    [J]. ACM SIGMOD/PODS 2009 CONFERENCE, 2009, : 615 - 626
  • [3] Run-time memory optimization for DDMB architecture through a CCB algorithm
    Cho, Jeonghun
    Paek, Yunheung
    [J]. EMERGING DIRECTIONS IN EMBEDDED AND UBIQUITOUS COMPUTING, 2006, 4097 : 775 - 784
  • [4] Run-time reconfiguration of FPGA-based drive controllers
    Schulz, B.
    Paiz, C.
    Hagemeyer, J.
    Mathapati, S.
    Porrmann, M.
    Boecker, J.
    [J]. 2007 EUROPEAN CONFERENCE ON POWER ELECTRONICS AND APPLICATIONS, VOLS 1-10, 2007, : 4648 - +
  • [5] On the run-time optimization of the Boolean logic of a program
    CADOLINO, C
    GUAZZO, M
    [J]. INFORMATION PROCESSING & MANAGEMENT, 1982, 18 (05) : 267 - 279
  • [6] ROS: Run-Time Optimization of SPARQL Queries
    Li, Liuqing
    Wang, Xin
    Meng, Xiansen
    Feng, Zhiyong
    [J]. WEB INFORMATION SYSTEMS AND MINING, PT II, 2011, 6988 : 79 - 86
  • [7] Run-time spatial locality detection and optimization
    Johnson, TL
    Merten, MC
    Hwu, WW
    [J]. THIRTIETH ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE, PROCEEDINGS, 1997, : 57 - 64
  • [8] Run-time optimization of a dynamically reconfigurable embedded system through performance prediction
    Mariani, Giovanni
    Sima, Vlad-Mihai
    Palermo, Gianluca
    Zaccaria, Vittorio
    Marchiori, Giacomo
    Silvano, Cristina
    Bertels, Koen
    [J]. 2013 23RD INTERNATIONAL CONFERENCE ON FIELD PROGRAMMABLE LOGIC AND APPLICATIONS (FPL 2013) PROCEEDINGS, 2013,
  • [9] Self-Optimization of DHT Lookups through Run-Time Performance Analysis
    Juenemann, Konrad
    Hartenstein, Hannes
    [J]. 2014 INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING & SIMULATION (HPCS), 2014, : 407 - 415
  • [10] Survivability Through Run-Time Software Evolution
    Simmons, Sharon
    Edwards, Dennis
    [J]. 2009 8TH IEEE INTERNATIONAL SYMPOSIUM ON NETWORK COMPUTING AND APPLICATIONS, 2009, : 108 - 113