SciSpark: Applying In-memory Distributed Computing to Weather Event Detection and Tracking

Authors
Palamuttam, Rahul [1, 3]
Mogrovejo, Renato Marroquin [1, 4]
Mattmann, Chris [1, 2]
Wilson, Brian [1]
Whitehall, Kim [1]
Verma, Rishi [1]
McGibbney, Lewis [1]
Ramirez, Paul [1]
Affiliations
[1] CALTECH, NASA, Jet Prop Lab, Pasadena, CA 91125 USA
[2] Univ So Calif, Dept Comp Sci, Los Angeles, CA 90089 USA
[3] Univ Calif San Diego, La Jolla, CA USA
[4] ETH Univ, Zurich, Switzerland
Keywords
Apache Spark; in-memory distributed computing; large scientific datasets; mesoscale convective complexes
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
In this paper we present SciSpark, a Big Data framework that extends Apache (TM) Spark for scaling scientific computations. The paper details the initial architecture and design of SciSpark. We demonstrate how SciSpark achieves parallel ingesting and partitioning of earth science satellite and model datasets. We also illustrate the usability and extensibility of SciSpark by implementing aspects of the Grab 'em Tag 'em Graph 'em (GTG) algorithm using SciSpark and its Map Reduce capabilities. GTG is a topical automated method for identifying and tracking Mesoscale Convective Complexes in satellite infrared datasets.
Pages: 2020-2026
Page count: 7