Efficient clustered server-side data analysis workflows using SWAMP

Cited by: 6
Authors
Wang, Daniel L. [1, 3]
Zender, Charles S. [2]
Jenks, Stephen F. [3]
Affiliations
[1] SLAC Natl Accelerator Lab, Menlo Pk, CA USA
[2] Univ Calif Irvine, Dept Earth Syst Sci, Irvine, CA USA
[3] Univ Calif Irvine, Dept Elec Engn & Comp Sci, Irvine, CA USA
Funding
National Science Foundation (USA)
Keywords
Data management; Geoscience; Parallel computing; Script compilation; NETCDF; TOOL
DOI
10.1007/s12145-009-0021-z
Chinese Library Classification number
TP39 [Computer applications]
Subject classification codes
081203; 0835
Abstract
Technology continues to enable scientists to set new records in data collection and production, intensifying a need for large-scale tools to efficiently process and analyze the growing mountain of data. To complement growth in the number of data centers and the volume of data they store, we introduce our Script Workflow Analysis for MultiProcessing (SWAMP) system. Our system provides safe server-side processing capabilities that allow scientists to reuse familiar desktop-based analysis methods represented in shell scripts. Built-in script compilation isolates file accesses and generates workflows, while a cluster-capable execution engine partitions and executes the resulting workflow. Benchmarks illustrate up to 20X performance gains, as well as the importance of I/O considerations, which make other computation systems less effective at geoscience data reduction.
Pages: 141-155
Page count: 15
Related papers
  • [1] Efficient clustered server-side data analysis workflows using SWAMP
    Daniel L. Wang
    Charles S. Zender
    Stephen F. Jenks
    Earth Science Informatics, 2009, 2 : 141 - 155
  • [2] Server-side parallel data reduction and analysis
    Wang, Daniel L.
    Zender, Charles S.
    Jenks, Stephen F.
    ADVANCES IN GRID AND PERVASIVE COMPUTING, PROCEEDINGS, 2007, 4459 : 744 - +
  • [3] Using server-side includes
    Kruse, M
    DR DOBBS JOURNAL, 1996, 21 (02): : 52 - &
  • [4] Using server-side includes
    Dr Dobb's J Software Tools Prof Program, 2 (3pp):
  • [5] Server-Side Dynamic Code Analysis
    Guizani, Wadie
    Marion, Jean-Yves
    Reynaud-Plantey, Daniel
    2009 4TH INTERNATIONAL CONFERENCE ON MALICIOUS AND UNWANTED SOFTWARE (MALWARE 2009), 2009, : 55 - 62
  • [6] On the validity of client-side vs server-side web log data analysis
    Yun, Gi Woong
    Ford, Jay
    Hawkins, Robert P.
    Pingree, Suzanne
    McTavish, Fiona
    Gustafson, David
    Berhe, Haile
    INTERNET RESEARCH, 2006, 16 (05) : 537 - 552
  • [7] Enabling geovisual analytics of health data using a server-side approach
    Turdukulov, Ulanbek
    Moncrieff, Simon
    CARTOGRAPHY AND GEOGRAPHIC INFORMATION SCIENCE, 2016, 43 (01) : 16 - 29
  • [8] Data Extraction Formulation for Efficient Data Synchronization Between Mobile Databases and Server-Side Database
    Imam, Abdullahi Abubakar
    Basri, Shuib
    Ahmad, Rohiza
    ADVANCED SCIENCE LETTERS, 2018, 24 (02) : 1066 - 1070
  • [9] Efficiency Analysis of the Server-Side Numerical Computations
    Piorkowski, Adam
    Plodzien, Daniel
    COMPUTER NETWORKS, PROCEEDINGS, 2009, 39 : 225 - 232
  • [10] Designing an efficient and scalable server-side asynchrony model for CORBA
    Brunsch, D
    O'Ryan, C
    Schmidt, DC
    ACM SIGPLAN NOTICES, 2001, 36 (08) : 223 - 229