Run-Time Optimization for Learned Controllers Through Quantitative Games

Cited by: 24

Authors:

Avni, Guy [1]

Bloem, Roderick [2]

Chatterjee, Krishnendu [1]

Henzinger, Thomas A. [1]

Konighofer, Bettina [2]

Pranger, Stefan [2]

Affiliations:

[1] IST Austria, Klosterneuburg, Austria

[2] Graz Univ Technol, Graz, Austria

Source:

COMPUTER AIDED VERIFICATION, CAV 2019, PT I | 2019 / Volume 11561

Funding:

Austrian Science Fund;

DOI:

10.1007/978-3-030-25540-4_36

Chinese Library Classification number:

TP31 [Computer Software];

Subject classification codes:

081202 ; 0835 ;

Abstract:

A controller is a device that interacts with a plant. At each time point, it reads the plant's state and issues commands with the goal that the plant operates optimally. Constructing optimal controllers is a fundamental and challenging problem. Machine learning techniques have recently been successfully applied to train controllers, yet they have limitations. Learned controllers are monolithic and hard to reason about. In particular, it is difficult to add features without retraining, to guarantee any level of performance, and to achieve acceptable performance when encountering untrained scenarios. These limitations can be addressed by deploying quantitative run-time shields that serve as a proxy for the controller. At each time point, the shield reads the command issued by the controller and may choose to alter it before passing it on to the plant. We show how optimal shields that interfere as little as possible while guaranteeing a desired level of controller performance can be generated systematically and automatically using reactive synthesis. First, we abstract the plant by building a stochastic model. Second, we consider the learned controller to be a black box. Third, we measure controller performance and shield interference by two quantitative run-time measures that are formally defined using weighted automata. Then, the problem of constructing a shield that guarantees maximal performance with minimal interference is the problem of finding an optimal strategy in a stochastic 2-player game "controller versus shield" played on the abstract state space of the plant with a quantitative objective obtained from combining the performance and interference measures. We illustrate the effectiveness of our approach by automatically constructing lightweight shields for learned traffic-light controllers in various road networks. The shields we generate avoid liveness bugs, improve controller performance in untrained and changing traffic situations, and add features to learned controllers, such as giving priority to emergency vehicles.
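
The proxy arrangement described in the abstract, where a shield reads the controller's command and may alter it, can be pictured with a minimal Python sketch. This is not the paper's synthesis procedure: it replaces the stochastic game with a greedy one-step rule, and the state encoding, the names `performance_cost`, `shield`, and `stubborn_controller`, and the fixed `interference_cost` are all hypothetical choices made for illustration.

```python
from typing import Tuple

# Hypothetical abstract plant state: (cars waiting north-south, cars waiting east-west).
State = Tuple[int, int]
ACTIONS = ("NS_GREEN", "EW_GREEN")


def performance_cost(state: State, action: str) -> int:
    """Toy performance measure: cars left waiting at the red light."""
    ns, ew = state
    return ew if action == "NS_GREEN" else ns


def shield(state: State, command: str, interference_cost: float) -> str:
    """Pick the action with the lowest combined cost.

    The combined cost is the toy performance cost plus a fixed penalty
    whenever the shield alters the controller's command. This is a greedy
    one-step rule (an assumption), not an optimal game strategy.
    """
    def total(action: str) -> float:
        penalty = interference_cost if action != command else 0.0
        return performance_cost(state, action) + penalty

    # Ties are resolved in favour of the controller's own command.
    return min(ACTIONS, key=lambda a: (total(a), a != command))


def stubborn_controller(state: State) -> str:
    """Stand-in for a black-box learned controller that always favours north-south."""
    return "NS_GREEN"


if __name__ == "__main__":
    for state in [(1, 2), (0, 9)]:
        command = stubborn_controller(state)
        print(state, command, "->", shield(state, command, interference_cost=3.0))
```

With a balanced queue the sketch passes the controller's command through unchanged, and it overrides only when the waiting queue makes the command more costly than the interference penalty. A larger `interference_cost` makes the toy shield more reluctant to intervene.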


Pages: 630-649

Page count: 20
