Shock: Active Storage for Multicloud Streaming Data Analysis

Cited by: 2
Authors
Bischof, Jared [1, 2]
Wilke, Andreas [1, 2]
Gerlach, Wolfgang [1, 2]
Harrison, Travis [1, 2]
Paczian, Tobias [1, 2]
Tang, Wei [3]
Trimble, William [1, 2]
Wilkening, Jared [4]
Desai, Narayan [5]
Meyer, Folker [1, 2]
Affiliations
[1] Argonne Natl Lab, Argonne, IL 60439 USA
[2] Univ Chicago, Chicago, IL 60637 USA
[3] Google Inc, Mountain View, CA USA
[4] Dramafever Inc, New York, NY USA
[5] Ericsson, San Jose, CA USA
Keywords
bioinformatics; metagenomics; active object store; distributed wide-area computing
DOI
10.1109/BDC.2015.40
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Access to data plays a major role in designing and performing efficient data computation and analyses in a distributed environment. Existing approaches access data via a variety of methods and offer various benefits and drawbacks depending on the use case. Our original use case was the computational analysis of environmental sequence data, or metagenomics. Other workflows often reduce the dataset size dramatically within the first few processing steps, owing to biologically motivated data compression. Metagenomic data, by contrast, compresses poorly, so metagenomic workflows add to the size of the dataset along the processing pipeline. Thus, wide-area, high-throughput access to the data is essential. To address this problem, we developed Shock, a data store for files, their associated metadata, and indexes that allow Shock to provide different views into the data. Shock comprises three major components: a web service that provides a RESTful API, backend data storage for files, and storage for object metadata. Shock has proven to be a stable data store for MG-RAST, an application that served over 40,000 users in 2014 on a server that houses more than 3 million data objects. Moreover, Shock provides both subselection and high-performance file transfer capabilities that cover most use cases.
Pages: 68-72
Page count: 5