MDPFuzz: Testing Models Solving Markov Decision Processes

Cited by: 10
Authors
Pang, Qi [1 ]
Yuan, Yuanyuan [1 ]
Wang, Shuai [1 ]
Affiliations
[1] Hong Kong Univ Sci & Technol, Hong Kong, Peoples R China
Keywords
Deep learning testing; Markov decision process
DOI
10.1145/3533767.3534388
CLC Number
TP31 [Computer Software]
Discipline Classification Code
081202; 0835
Abstract
The Markov decision process (MDP) provides a mathematical framework for modeling sequential decision-making problems, many of which are crucial to security and safety, such as autonomous driving and robot control. The rapid development of artificial intelligence research has created efficient methods for solving MDPs, such as deep neural networks (DNNs), reinforcement learning (RL), and imitation learning (IL). However, these popular models solving MDPs are neither thoroughly tested nor rigorously reliable. We present MDPFuzz, the first blackbox fuzz testing framework for models solving MDPs. MDPFuzz forms testing oracles by checking whether the target model enters abnormal and dangerous states. During fuzzing, MDPFuzz decides which mutated state to retain by measuring if it can reduce cumulative rewards or form a new state sequence. We design efficient techniques to quantify the "freshness" of a state sequence using Gaussian mixture models (GMMs) and dynamic expectation-maximization (DynEM). We also prioritize states with high potential of revealing crashes by estimating the local sensitivity of target models over states. MDPFuzz is evaluated on five state-of-the-art models for solving MDPs, including supervised DNN, RL, IL, and multi-agent RL. Our evaluation includes scenarios of autonomous driving, aircraft collision avoidance, and two games that are often used to benchmark RL. During a 12-hour run, we find over 80 crash-triggering state sequences on each model. We show inspiring findings that crash-triggering states, though they look normal, induce distinct neuron activation patterns compared with normal states. We further develop an abnormal behavior detector to harden all the evaluated models and repair them with the findings of MDPFuzz to significantly enhance their robustness without sacrificing accuracy.
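To make the fuzzing loop described in the abstract concrete, the sketch below illustrates the kind of seed-retention logic MDPFuzz describes: a mutated initial state is kept if it reduces the cumulative reward or if its state sequence looks "fresh" under a Gaussian mixture model fit to previously seen states, and any rollout that hits an abnormal state is recorded as a crash. This is only a minimal illustration under assumed interfaces: the Gym-style env, the model.predict call, the mutate noise operator, the freshness_threshold, and the one-shot GMM fit are placeholders, and the paper's dynamic EM (DynEM) updates and sensitivity-based seed prioritization are not reproduced here.

import numpy as np
from sklearn.mixture import GaussianMixture


def mutate(state, scale=0.01, rng=np.random.default_rng()):
    # Hypothetical mutation operator: small Gaussian perturbation of the
    # initial state vector (MDPFuzz's actual mutations are task-specific).
    return state + rng.normal(0.0, scale, size=np.shape(state))


def rollout(env, model, init_state, max_steps=500):
    # Run the target model from a given initial state; return cumulative
    # reward, the visited states, and whether the testing oracle fired.
    # `env` is assumed to expose a Gym-like reset(state)/step(action)
    # interface and to flag abnormal states via info["crash"].
    state = env.reset(init_state)
    states, total_reward, crashed = [state], 0.0, False
    for _ in range(max_steps):
        action = model.predict(state)
        state, reward, done, info = env.step(action)
        total_reward += reward
        states.append(state)
        if info.get("crash", False):
            crashed = True
            break
        if done:
            break
    return total_reward, np.asarray(states), crashed


def freshness(gmm, states):
    # Lower average log-likelihood under the GMM of previously seen states
    # means the sequence covers less-explored regions, i.e. is "fresher".
    return -gmm.score(states)


def fuzz(env, model, seeds, iterations=1000, freshness_threshold=5.0,
         rng=np.random.default_rng()):
    corpus, rewards, crashes, all_states = [], [], [], []
    for s in seeds:
        r, states, _ = rollout(env, model, s)
        corpus.append(s)
        rewards.append(r)
        all_states.append(states)
    # One GMM over the states seen so far; the paper keeps this density
    # model current with DynEM, whereas this sketch fits it only once.
    gmm = GaussianMixture(n_components=8).fit(np.concatenate(all_states))

    for _ in range(iterations):
        idx = rng.integers(len(corpus))
        candidate = mutate(corpus[idx])
        reward, states, crashed = rollout(env, model, candidate)
        if crashed:
            crashes.append(candidate)  # oracle violation found
            continue
        # Retain the mutant if it reduces cumulative reward or yields a
        # fresh state sequence, mirroring the retention rule in the abstract.
        if reward < rewards[idx] or freshness(gmm, states) > freshness_threshold:
            corpus.append(candidate)
            rewards.append(reward)
    return crashes

The retention rule only grows the corpus with mutants that are potentially more failure-prone or that explore new regions of the state space, which keeps the fuzzing budget focused; the sensitivity-guided prioritization that MDPFuzz additionally uses to rank seeds is omitted from this sketch.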
Pages: 378-390
Number of pages: 13
Related Papers
50 records in total
  • [1] Decomposition methods for solving Markov decision processes with multiple models of the parameters
    Steimle, Lauren N.
    Ahluwalia, Vinayak S.
    Kamdar, Charmee
    Denton, Brian T.
    [J]. IISE TRANSACTIONS, 2021, 53 (12) : 1295 - 1310
  • [2] Solving concurrent Markov decision processes
    Mausam
    Weld, DS
    [J]. PROCEEDINGS OF THE NINETEENTH NATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND THE SIXTEENTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2004, : 716 - 722
  • [3] Solving hybrid Markov decision processes
    Reyes, Alberto
    Sucar, L. Enrique
    Morales, Eduardo F.
    Ibarguengoytia, Pablo H.
    [J]. MICAI 2006: ADVANCES IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2006, 4293 : 227 - +
  • [4] Efficient Model Solving for Markov Decision Processes
    Sapio, Adrian
    Bhattacharyya, Shuvra S.
    Wolf, Marilyn
    [J]. 2020 IEEE SYMPOSIUM ON COMPUTERS AND COMMUNICATIONS (ISCC), 2020, : 607 - 611
  • [5] Ordinal Decision Models for Markov Decision Processes
    Weng, Paul
    [J]. 20TH EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE (ECAI 2012), 2012, 242 : 828 - 833
  • [6] Solving Markov Decision Processes with Downside Risk Adjustment
    Abhijit Gosavi
    Anish Parulekar
    [J]. Machine Intelligence Research, 2016, 13 (03) : 235 - 245
  • [7] Solving Markov decision processes with downside risk adjustment
    Gosavi A.
    Parulekar A.
    [J]. International Journal of Automation and Computing, 2016, 13 (3) : 235 - 245
  • [8] Solving transition independent decentralized Markov decision processes
    Becker, R
    Zilberstein, S
    Lesser, V
    Goldman, CV
    [J]. JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2004, 22 : 423 - 455
  • [9] Solving Markov Decision Processes with Partial State Abstractions
    Nashed, Samer B.
    Svegliato, Justin
    Brucato, Matteo
    Basich, Connor
    Grupen, Rod
    Zilberstein, Shlomo
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2021), 2021, : 813 - 819
  • [10] Evolutionary policy iteration for solving Markov decision processes
    Chang, HS
    Lee, HG
    Fu, MC
    Marcus, SI
    [J]. IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2005, 50 (11) : 1804 - 1808