END-TO-END PROCESS ORCHESTRATION OF EARTH OBSERVATION DATA WORKFLOWS WITH APACHE AIRFLOW ON HIGH PERFORMANCE COMPUTING

Cited by: 3
Authors
Tian, Liang [1]
Sedona, Rocco [1, 2]
Mozaffari, Amirpasha [2]
Kreshpa, Enxhi [2]
Paris, Claudia [3]
Riedel, Morris [1, 2]
Schultz, Martin G. [2]
Cavallaro, Gabriele [1, 2]
Affiliations
[1] Univ Iceland, Sch Engn & Nat Sci, IS-107 Reykjavik, Iceland
[2] Forschungszentrum Julich, Julich Supercomp Ctr, D-52428 Julich, Germany
[3] Univ Twente, NL-7514 AE Enschede, Netherlands
Funding
EU Horizon 2020;
Keywords
Workflows; Deep Learning (DL); High-Performance Computing (HPC); remote sensing data;
DOI
10.1109/IGARSS52108.2023.10283416
Chinese Library Classification number
P [Astronomy, Earth Sciences];
Discipline classification code
07;
Abstract
Earth Observation (EO) data processing faces challenges due to large volumes, multiple sources, and diverse formats. To address this issue, this paper presents a scalable and parallelizable workflow using Apache Airflow, capable of integrating Machine Learning (ML) and Deep Learning (DL) models with Modular Supercomputing Architecture (MSA) systems. To test the workflow, we considered the production of large-scale Land-Cover (LC) maps as a case study. The workflow manager, Airflow, offers scalability, extensibility, and programmable task definition in Python. It allows us to execute different steps of the workflow in different High-Performance Computing (HPC) systems. The workflow is demonstrated on the Dynamical Exascale Entry Platform (DEEP) and Jülich Research on Exascale Cluster Architectures (JURECA) hosted at the Jülich Supercomputing Centre (JSC), a platform that incorporates heterogeneous JSC systems.
Pages: 711 - 714
Number of pages: 4
Related papers
50 in total
  • [41] A Platform of Scientific Workflows for Orchestration of Parallel Components in a Cloud of High Performance Computing Applications
    Silva, Jefferson de Carvalho
    de Carvalho Junior, Francisco Heron
    PROGRAMMING LANGUAGES (SBLP 2016), 2016, 9889 : 156 - 170
  • [42] Towards Data-driven Simulation of End-to-end Network Performance Indicators
    Sliwa, Benjamin
    Wietfeld, Christian
    2019 IEEE 90TH VEHICULAR TECHNOLOGY CONFERENCE (VTC2019-FALL), 2019,
  • [43] End-to-end strategy to enable high throughput process development of conjugate vaccines
    Daniels, Chris
    Wang, Sheng-ching
    Bowman, Amy
    Wen, Emily
    Varma, Misha
    Svab, Thomas
    Dieter, Lance
    McHugh, Pat
    Winters, Michael
    Wenger, Marc
    ABSTRACTS OF PAPERS OF THE AMERICAN CHEMICAL SOCIETY, 2014, 247
  • [44] Deep Clustering and Visualization for End-to-End High-Dimensional Data Analysis
    Wu, Lirong
    Yuan, Lifan
    Zhao, Guojiang
    Lin, Haitao
    Li, Stan Z.
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 34 (11) : 8543 - 8554
  • [45] An End-to-End Security Architecture to Collect, Process and Share Wearable Medical Device Data
    Rohloff, Kurt
    Polyakov, Yuriy
    2015 17TH INTERNATIONAL CONFERENCE ON E-HEALTH NETWORKING, APPLICATION & SERVICES (HEALTHCOM), 2015, : 615 - 620
  • [46] Building a High Performance End-to-End Explicit Discourse Parser for Practical Application
    Wang, Jianxiang
    Lan, Man
    KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, KSEM 2015, 2015, 9403 : 324 - 335
  • [47] A High-Performance Neural Network SoC for End-to-End Speaker Verification
    Tsai, Tsung-Han
    Chiang, Meng-Jui
    IEEE Access, 2024, 12 : 165482 - 165496
  • [48] A Workflow-based Network Advisor for Data Movement with End-to-end Performance Optimization
    Brown, Patrick
    Zhu, Mengxia
    Wu, Qishi
    Yun, Daqing
    Zurawski, Jason
    2012 SC COMPANION: HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS (SCC), 2012, : 73 - 81
  • [49] Missing the Forest for the Trees: End-to-End AI Application Performance in Edge Data Centers
    Richins, Daniel
    Doshi, Dharmisha
    Blackmore, Matthew
    Nair, Aswathy Thulaseedharan
    Pathapati, Neha
    Patel, Ankit
    Daguman, Brainard
    Dobrijalowski, Daniel
    Illikkal, Ramesh
    Long, Kevin
    Zimmerman, David
    Reddi, Vijay Janapa
    2020 IEEE INTERNATIONAL SYMPOSIUM ON HIGH PERFORMANCE COMPUTER ARCHITECTURE (HPCA 2020), 2020, : 515 - 528
  • [50] Optimization and automation of an end-to-end high throughput microscale transient protein production process
    Bos, Aaron B.
    Luan, Peng
    Duque, Joseph N.
    Reilly, Dorothea
    Harms, Peter D.
    Wong, Athena W.
    BIOTECHNOLOGY AND BIOENGINEERING, 2015, 112 (09) : 1832 - 1842