The Globus Compute dataset: An open function-as-a-service dataset from the edge to the cloud

Cited by: 7
Authors
Bauer, Andre [1, 2]
Pan, Haochen [1]
Chard, Ryan [2]
Babuji, Yadu [1]
Bryan, Josh [1]
Tiwari, Devesh [3]
Foster, Ian [1, 2]
Chard, Kyle [1, 2]
Affiliations
[1] Univ Chicago, Chicago, IL 60637 USA
[2] Argonne Natl Lab, Argonne, IL USA
[3] Northeastern Univ, Boston, MA 02138 USA
Funding
U.S. National Science Foundation
Keywords
Serverless computing; Globus Compute; FAIR dataset; Computing continuum
DOI
10.1016/j.future.2023.12.007
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
We present a unique function-as-a-service (FaaS) dataset capturing the use of the Globus Compute (previously funcX) platform. Globus Compute implements a federated model via which users may deploy endpoints on arbitrary remote computers, from the edge to high performance computing (HPC) clusters, and they may then invoke Python functions on those endpoints via a reliable cloud-hosted service. The dataset covers 31 weeks and includes 2,121,472 task submissions from 252 users executed on 580 remote computing endpoints. It includes 277,386 registered functions. We describe the dataset and various observations, some that are similar to other FaaS datasets, for example, that 74% of tasks run for less than 1 s, and some that are unique to Globus Compute, for example, that endpoints are used in different ways and that the majority of functions are related to scientific computing and machine learning. To the best of our knowledge, this dataset represents the first federated FaaS dataset that includes user workloads, distributed computing endpoints, and analysis of registered function bodies. We expect the dataset to be useful for researching FaaS architectures, workload modeling, container warming, and other distributed computing architectures.
Pages: 558-574
Page count: 17