Distributed In Situ Processing of Big Raster Data in the Cloud

Cited by: 3
Authors
Zalipynis, Ramon Antonio Rodriges [1]
Affiliations
[1] National Research University Higher School of Economics, Moscow, Russia
Funding
Russian Foundation for Basic Research
Keywords
Big raster data; Climate reanalysis; Distributed systems; Cloud computing; SciDB; Array DBMS; In situ; NetCDF operators
DOI
10.1007/978-3-319-74313-4_24
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
A raster is the primary data type in Earth science, geology, remote sensing, and other fields experiencing tremendous growth in data volumes. An array DBMS is one option for tackling big raster data processing. However, raster data are traditionally stored in files, not in databases. Command line tools have long been developed to process raster files. Most of these tools are feature-rich and free but optimized for a single machine. This paper proposes new techniques for distributed processing of raster data directly in diverse file formats by delegating considerable portions of the work to such tools. An N-dimensional array data model is proposed to maintain independence from the files and the tools. In addition, a new scheme named GROUP-APPLY-FINALLY is presented to universally express the majority of raster data processing operations and streamline their distributed execution. The new approaches make it possible to provide a rich collection of raster operations at scale and to outperform SciDB by more than 410x on average on climate reanalysis data. SciDB is the only freely available distributed array DBMS to date. Experiments were carried out on 8- and 16-node clusters in the Microsoft Azure Cloud.
Pages: 337-351
Page count: 15