Efficient clustered server-side data analysis workflows using SWAMP

Cited by: 6
Authors
Wang, Daniel L. [1, 3]
Zender, Charles S. [2]
Jenks, Stephen F. [3]
Affiliations
[1] SLAC Natl Accelerator Lab, Menlo Pk, CA USA
[2] Univ Calif Irvine, Dept Earth Syst Sci, Irvine, CA USA
[3] Univ Calif Irvine, Dept Elec Engn & Comp Sci, Irvine, CA USA
Funding
U.S. National Science Foundation;
Keywords
Data management; Geoscience; Parallel computing; Script compilation; NETCDF; TOOL;
DOI
10.1007/s12145-009-0021-z
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203; 0835;
Abstract
Technology continues to enable scientists to set new records in data collection and production, intensifying a need for large scale tools to efficiently process and analyze the growing mountain of data. To complement growth in the number of data centers and the volume of data they store, we introduce our Script Workflow Analysis for MultiProcessing (SWAMP) system. Our system provides safe server-side processing capabilities that allow scientists to reuse familiar desktop-based analysis methods represented in shell-scripts. Built-in script compilation isolates file accesses and generates workflows, while a cluster-capable execution engine partitions and executes the resulting workflow. Benchmarks illustrate up to 20X performance gains, as well as the importance of I/O considerations which make other computation systems less effective at geoscience data reduction.
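The script-compilation idea in the abstract (isolating file accesses, then generating a workflow that can be partitioned) can be illustrated with a minimal sketch. This is not SWAMP's actual compiler, which the record does not describe. The function names `build_workflow` and `parallel_stages` are hypothetical, and the sketch assumes every command follows an NCO-style convention in which the last non-option argument is the output file and the others are inputs. It tracks only read-after-write dependencies.

```python
import shlex


def build_workflow(script):
    """Parse a shell-like script into commands and read-after-write dependencies.

    Assumption (not from the source): for each command, the last non-option
    argument is the output file and the remaining ones are input files.
    Returns (commands, deps), where deps[i] is the set of indices of earlier
    commands whose output file command i reads.
    """
    commands = []
    deps = []
    last_writer = {}  # filename -> index of the command that last wrote it
    for line in script.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        argv = shlex.split(line)
        files = [arg for arg in argv[1:] if not arg.startswith("-")]
        if not files:
            continue  # no file accesses to track for this command
        *inputs, output = files
        index = len(commands)
        commands.append(argv)
        deps.append({last_writer[f] for f in inputs if f in last_writer})
        last_writer[output] = index
    return commands, deps


def parallel_stages(deps):
    """Group command indices into stages.

    A command is placed one stage after the latest stage among the commands
    it depends on, so commands within a stage do not depend on each other.
    """
    level = []
    for d in deps:
        level.append(1 + max((level[j] for j in d), default=-1))
    stages = [[] for _ in range(max(level, default=-1) + 1)]
    for i, lv in enumerate(level):
        stages[lv].append(i)
    return stages


if __name__ == "__main__":
    # Illustrative script only; the file names are made up.
    example = """
    ncra jan.nc feb.nc winter.nc
    ncra jun.nc jul.nc summer.nc
    ncdiff summer.nc winter.nc diff.nc
    """
    commands, deps = build_workflow(example)
    for stage in parallel_stages(deps):
        print([" ".join(commands[i]) for i in stage])
```

In this example the two averaging commands share no files and land in the same stage, while the differencing command waits for both. A real system would also need to handle write-after-read hazards, shell variables, and loops, all of which this sketch ignores.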
Pages: 141-155
Number of pages: 15
Related papers
50 in total
  • [21] Server-Side Image Segmentation and Patient-Related Data Storage
    Virag, Ioan
    Stoicu-Tivadar, Lacramioara
    Crisan-Vida, Mihaela
    Amaricai, Elena
    SOFT COMPUTING APPLICATIONS, (SOFA 2014), VOL 1, 2016, 356 : 259 - 266
  • [22] Client-side versus server-side geographic data processing performance comparison: Data and code
    Kulawiak, Marcin
    DATA IN BRIEF, 2019, 26
  • [23] Data Synchronization Model for Heterogeneous Mobile Databases and Server-side Database
    Imam, Abdullahi Abubakar
    Basri, Shuib
    Ahmad, Rohiza
    Gilal, Abdul Rehman
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2018, 9 (01) : 521 - 531
  • [24] A Hybrid Approach to Detect Injection Attacks on Server-side Applications using Data Mining Techniques
    Ahmed, Abu Syeed Sajid
    Shachi, Mehjabeen
    Brishty, Afsana Afrin
    Siddiqui, Nurnaby
    Sakib, Nazmus
    2021 3RD INTERNATIONAL CONFERENCE ON SUSTAINABLE TECHNOLOGIES FOR INDUSTRY 4.0 (STI), 2021,
  • [25] Efficient Compression for Server-Side G-Buffer Streaming in Web Applications
    Raesch, Sascha
    Herz, Maximilian
    Behr, Johannes
    Kuijper, Arjan
    WEB3D 2017, 2017,
  • [26] Server-side Efficient Parity Generation for Cluster-wide RAID System
    Ohtsuji, Hiroki
    Tatebe, Osamu
    2015 IEEE 7TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING TECHNOLOGY AND SCIENCE (CLOUDCOM), 2015, : 444 - 447
  • [27] Enhancing Federated Learning With Server-Side Unlabeled Data by Adaptive Client and Data Selection
    Xu, Yang
    Wang, Lun
    Xu, Hongli
    Liu, Jianchun
    Wang, Zhiyuan
    Huang, Liusheng
    IEEE TRANSACTIONS ON MOBILE COMPUTING, 2024, 23 (04) : 2813 - 2831
  • [28] Finding Server-Side Endpoints with Static Analysis of Client-Side JavaScript
    Sigalov, Daniil
    Gamayunov, Dennis
    COMPUTER SECURITY. ESORICS 2023 INTERNATIONAL WORKSHOPS, CPS4CIP, PT II, 2024, 14399 : 442 - 458
  • [29] Server-side Prediction of Source IP Addresses using Density Estimation
    Goldstein, Markus
    Reif, Matthias
    Stahl, Armin
    Breuel, Thomas
    2009 INTERNATIONAL CONFERENCE ON AVAILABILITY, RELIABILITY, AND SECURITY (ARES), VOLS 1 AND 2, 2009, : 82 - 89
  • [30] HSM-Based Architecture to Detect Insider Attacks on Server-Side Data
    Dib, Marc
    Pierre, Samuel
    IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, 2025, 20 : 2538 - 2549