Run-Time Optimization for Learned Controllers Through Quantitative Games

Cited by: 24
Authors
Avni, Guy [1]
Bloem, Roderick [2]
Chatterjee, Krishnendu [1]
Henzinger, Thomas A. [1]
Konighofer, Bettina [2]
Pranger, Stefan [2]
Affiliations
[1] IST Austria, Klosterneuburg, Austria
[2] Graz Univ Technol, Graz, Austria
Source
Funding
Austrian Science Fund (FWF)
Keywords
DOI
10.1007/978-3-030-25540-4_36
Chinese Library Classification
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
A controller is a device that interacts with a plant. At each time point, it reads the plant's state and issues commands with the goal that the plant operates optimally. Constructing optimal controllers is a fundamental and challenging problem. Machine learning techniques have recently been successfully applied to train controllers, yet they have limitations. Learned controllers are monolithic and hard to reason about. In particular, it is difficult to add features without retraining, to guarantee any level of performance, and to achieve acceptable performance when encountering untrained scenarios. These limitations can be addressed by deploying quantitative run-time shields that serve as a proxy for the controller. At each time point, the shield reads the command issued by the controller and may choose to alter it before passing it on to the plant. We show how optimal shields, which interfere as little as possible while guaranteeing a desired level of controller performance, can be generated systematically and automatically using reactive synthesis. First, we abstract the plant by building a stochastic model. Second, we consider the learned controller to be a black box. Third, we measure controller performance and shield interference by two quantitative run-time measures that are formally defined using weighted automata. Then, the problem of constructing a shield that guarantees maximal performance with minimal interference is the problem of finding an optimal strategy in a stochastic 2-player game "controller versus shield" played on the abstract state space of the plant with a quantitative objective obtained from combining the performance and interference measures. We illustrate the effectiveness of our approach by automatically constructing lightweight shields for learned traffic-light controllers in various road networks. The shields we generate avoid liveness bugs, improve controller performance in untrained and changing traffic situations, and add features to learned controllers, such as giving priority to emergency vehicles.
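The shield's role as a proxy between controller and plant can be illustrated with a minimal sketch. This is not the synthesis procedure from the paper, which solves a stochastic 2-player game over the abstract state space; it is a one-step greedy stand-in. The state encoding, the `step_cost` performance measure, the `interference_penalty` parameter, and the two-command intersection are all hypothetical choices made for illustration.

```python
from typing import Tuple

# Hypothetical abstract plant state: (cars queued north-south, cars queued east-west).
State = Tuple[int, int]
# Hypothetical commands: which direction receives the green light.
COMMANDS = ("NS", "EW")


def step_cost(state: State, command: str) -> float:
    """Assumed performance measure: sum of squared queue lengths
    after the green direction discharges one car."""
    ns, ew = state
    if command == "NS":
        ns = max(0, ns - 1)
    else:
        ew = max(0, ew - 1)
    return float(ns * ns + ew * ew)


def shield(state: State, proposed: str, interference_penalty: float = 2.0) -> str:
    """Return the command passed on to the plant.

    The controller's proposed command is kept unless another command
    lowers the performance cost by more than the interference penalty.
    """
    # List the proposed command first so that ties favor non-interference.
    candidates = [proposed] + [c for c in COMMANDS if c != proposed]

    def total(command: str) -> float:
        penalty = 0.0 if command == proposed else interference_penalty
        return step_cost(state, command) + penalty

    return min(candidates, key=total)


if __name__ == "__main__":
    # A black-box controller that always proposes "EW", even with an empty EW queue.
    print(shield((3, 0), "EW"))  # the shield overrides
    print(shield((1, 1), "EW"))  # the shield leaves the command alone
```

Raising `interference_penalty` makes the shield more reluctant to override, which mirrors the trade-off between performance and interference described in the abstract, though only for a single step rather than as an optimal long-run strategy.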
Pages: 630-649
Page count: 20