END-TO-END PROCESS ORCHESTRATION OF EARTH OBSERVATION DATA WORKFLOWS WITH APACHE AIRFLOW ON HIGH PERFORMANCE COMPUTING

Cited by: 3
Authors
Tian, Liang [1]
Sedona, Rocco [1, 2]
Mozaffari, Amirpasha [2]
Kreshpa, Enxhi [2]
Paris, Claudia [3]
Riedel, Morris [1, 2]
Schultz, Martin G. [2]
Cavallaro, Gabriele [1, 2]
Affiliations
[1] Univ Iceland, Sch Engn & Nat Sci, IS-107 Reykjavik, Iceland
[2] Forschungszentrum Julich, Julich Supercomp Ctr, D-52428 Julich, Germany
[3] Univ Twente, NL-7514 AE Enschede, Netherlands
Funding
EU Horizon 2020;
Keywords
Workflows; Deep Learning (DL); High-Performance Computing (HPC); remote sensing data;
DOI
10.1109/IGARSS52108.2023.10283416
Chinese Library Classification number
P [Astronomy, Earth Sciences];
Discipline classification code
07;
Abstract
Earth Observation (EO) data processing faces challenges due to large volumes, multiple sources, and diverse formats. To address this issue, this paper presents a scalable and parallelizable workflow using Apache Airflow, capable of integrating Machine Learning (ML) and Deep Learning (DL) models with Modular Supercomputing Architecture (MSA) systems. To test the workflow, we considered the production of large-scale Land-Cover (LC) maps as a case study. The workflow manager, Airflow, offers scalability, extensibility, and programmable task definition in Python. It allows us to execute different steps of the workflow in different High-Performance Computing (HPC) systems. The workflow is demonstrated on the Dynamical Exascale Entry Platform (DEEP) and Jülich Research on Exascale Cluster Architectures (JURECA) hosted at the Jülich Supercomputing Centre (JSC), a platform that incorporates heterogeneous JSC systems.
Pages: 711 - 714
Page count: 4
Related papers
50 in total
  • [11] Data-based description of process performance in end-to-end order processing
    Schuh, Günther
    Gützlaff, Andreas
    Schmitz, Seth
    van der Aalst, Wil M.P.
    CIRP Annals, 2020, 69 (01): 381 - 384
  • [12] A New End-to-End Flow-Control Mechanism for High Performance Computing Clusters
    Prades, Javier
    Silla, Federico
    Duato, Jose
    Froening, Holger
    Nuessle, Mondrian
    2012 IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING (CLUSTER), 2012, : 320 - 328
  • [13] Improved End-to-End Data Security Approach for Cloud Computing
    Ghosh, Soumalya
    Verma, Shiv Kumar
    Ghosh, Uttam
    Al-Numay, Mohammed
    SUSTAINABILITY, 2023, 15 (22)
  • [14] HIGH QUALITY END-TO-END LINK PERFORMANCE
    Wuebben, Dirk
    IEEE VEHICULAR TECHNOLOGY MAGAZINE, 2009, 4 (03): : 26 - 32
  • [15] Optimizing end-to-end performance of data-intensive computing pipelines in heterogeneous network environments
    Wu, Qishi
    Gu, Yi
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2011, 71 (02) : 254 - 265
  • [16] High-Performance End-to-End Integrity Verification on Big Data Transfer
    Jung, Eun-Sung
    Liu, Si
    Kettimuthu, Rajkumar
    Chung, Sungwook
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2019, E102D (08) : 1478 - 1488
  • [17] End-to-end trustworthy data access in data-oriented scientific computing
    Pallickara, Sangmi Lee
    Plale, Beth
    Fang, Liang
    Gannon, Dennis
    SIXTH IEEE INTERNATIONAL SYMPOSIUM ON CLUSTER COMPUTING AND THE GRID: SPANNING THE WORLD AND BEYOND, 2006, : 395 - +
  • [18] DLBooster: Boosting End-to-End Deep Learning Workflows with Offloading Data Preprocessing Pipelines
    Cheng, Yang
    Li, Dan
    Guo, Zhiyuan
    Jiang, Binyao
    Lin, Jiaxin
    Fan, Xi
    Geng, Jinkun
    Yu, Xinyi
    Bai, Wei
    Qu, Lei
    Shu, Ran
    Cheng, Peng
    Xiong, Yongqiang
    Wu, Jianping
    PROCEEDINGS OF THE 48TH INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING (ICPP 2019), 2019,
  • [19] End-to-end high performance mobility without infrastructure
    Davu, Sandeep
    Zaghal, Raid Y.
    Khan, Javed I.
    2005 IEEE INTERNATIONAL CONFERENCE ON ELECTRO/INFORMATION TECHNOLOGY (EIT 2005), 2005, : 302 - 309
  • [20] Accelerating and Expanding End-to-End Data Science Workflows with DL/ML Interoperability Using RAPIDS
    Richardson, Bartley
    Rees, Bradley
    Drabas, Tom
    Oldridge, Even
    Bader, David A.
    Allen, Rachel
    KDD '20: PROCEEDINGS OF THE 26TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2020, : 3503 - 3504