Randomized Testing of Distributed Systems with Probabilistic Guarantees

Cited by: 15
Authors
Ozkan, Burcu Kulahcioglu [1]
Majumdar, Rupak [1]
Niksic, Filip [1]
Befrouei, Mitra Tabaei [2]
Weissenbacher, Georg [2]
Affiliations
[1] Max Planck Inst Software Syst MPI SWS, Paul Ehrlich Str 26, D-67663 Rheinland Pfalz, Germany
[2] Vienna Univ Technol, Vienna, Austria
Funding
European Research Council; Austrian Science Fund
Keywords
distributed systems; random testing; hitting families; partially ordered sets; online chain partitioning
DOI
10.1145/3276530
Chinese Library Classification number
TP31 [Computer Software]
Subject classification codes
081202; 0835
Abstract
Several recently proposed randomized testing tools for concurrent and distributed systems come with theoretical guarantees on their success. The key to these guarantees is a notion of bug depth (the minimum length of a sequence of events sufficient to expose the bug) and a characterization of d-hitting families of schedules (a set of schedules guaranteed to cover every bug of given depth d). Previous results show that in certain cases the size of a d-hitting family can be significantly smaller than the total number of possible schedules. However, these results either assume shared-memory multithreading, or that the underlying partial ordering of events is known statically and has special structure. These assumptions are not met by distributed message-passing applications. In this paper, we present a randomized scheduling algorithm for testing distributed systems. In contrast to previous approaches, our algorithm works for arbitrary partially ordered sets of events revealed online as the program is being executed. We show that for partial orders of width at most w and size at most n (both statically unknown), our algorithm is guaranteed to sample from at most $w^2 n^{d-1}$ schedules, for every fixed bug depth d. Thus, our algorithm discovers a bug of depth d with probability at least $1/(w^2 n^{d-1})$. As a special case, our algorithm recovers a previous randomized testing algorithm for multithreaded programs. Our algorithm is simple to implement, but the correctness arguments depend on difficult combinatorial results about online dimension and online chain partitioning of partially ordered sets. We have implemented our algorithm in a randomized testing tool for distributed message-passing programs. We show that our algorithm can find bugs in distributed systems such as Zookeeper and Cassandra, and empirically outperforms naive random exploration while providing theoretical guarantees.
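To make the bound in the abstract concrete, the following is a minimal sketch that evaluates the schedule-count bound (w squared times n to the power d-1) and the resulting lower bound on the per-run detection probability. It does not implement the paper's scheduling algorithm. The function names are hypothetical, and the last helper additionally assumes that repeated test runs are independent, which the abstract does not state.

```python
import math


def hitting_family_bound(w: int, n: int, d: int) -> int:
    """Upper bound w^2 * n^(d-1) on the number of schedules sampled from,
    for width at most w, size at most n, and bug depth d."""
    if w < 1 or n < 1 or d < 1:
        raise ValueError("w, n and d must all be positive integers")
    return w ** 2 * n ** (d - 1)


def min_detection_probability(w: int, n: int, d: int) -> float:
    """Lower bound 1 / (w^2 * n^(d-1)) on finding a depth-d bug in one run."""
    return 1.0 / hitting_family_bound(w, n, d)


def runs_for_confidence(w: int, n: int, d: int, confidence: float) -> int:
    """Smallest k such that 1 - (1 - p)^k >= confidence, where p is the
    per-run lower bound. ASSUMPTION (not from the source): runs are
    independent, so the miss probability after k runs is at most (1 - p)^k."""
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    p = min_detection_probability(w, n, d)
    if p >= 1.0:
        return 1  # a single schedule already covers every such bug
    # log1p(-p) computes log(1 - p) accurately for small p.
    return math.ceil(math.log(1.0 - confidence) / math.log1p(-p))


if __name__ == "__main__":
    # Illustrative values only: width 3, 100 events, bug depth 2.
    print(hitting_family_bound(3, 100, 2))  # 900
    print(min_detection_probability(3, 100, 2))
    print(runs_for_confidence(3, 100, 2, 0.95))
```

For depth d = 1 the bound reduces to w squared and does not depend on n; each additional unit of depth multiplies the bound by n, so the guaranteed probability falls off quickly for deep bugs.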
Pages: 28