Multiclass classification of distributed memory parallel computations

Cited by: 7
Authors
Whalen, Sean [1]
Peisert, Sean [2, 3]
Bishop, Matt [3]
Affiliations
[1] Columbia Univ, Dept Comp Sci, New York, NY 10027 USA
[2] Univ Calif Berkeley, Lawrence Berkeley Natl Lab, Berkeley, CA 94720 USA
[3] Univ Calif Davis, Dept Comp Sci, Davis, CA 95616 USA
Keywords
Multiclass classification; Bayesian networks; Random forests; Self-organizing maps; High performance computing; Communication patterns; Network motifs
DOI
10.1016/j.patrec.2012.10.007
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
High Performance Computing (HPC) is a field concerned with solving large-scale problems in science and engineering. However, the computational infrastructure of HPC systems can also be misused, as demonstrated by the recent commoditization of cloud computing resources on the black market. As a first step towards addressing this, we introduce a machine learning approach for classifying distributed parallel computations based on communication patterns between compute nodes. We first provide relevant background on message passing and computational equivalence classes called dwarfs, and describe our exploratory data analysis using self-organizing maps. We then present our classification results across 29 scientific codes using Bayesian networks and compare their performance against Random Forest classifiers. These models, trained with hundreds of gigabytes of communication logs collected at Lawrence Berkeley National Laboratory, perform well without any a priori information and address several shortcomings of previous approaches. (C) 2012 Elsevier B.V. All rights reserved.
Pages: 322-329
Number of pages: 8