High-performance IoT streaming data prediction system using Spark: a case study of air pollution

Cited by: 0
Authors
Ho-Yong Jin
Eun-Sung Jung
Duckki Lee
Affiliations
[1] Hongik University, Department of Software and Communications Engineering
[2] Yonam Institute of Technology, Department of Smart Software
Source
Neural Computing & Applications, 2020, 32 (17)
Keywords
Long Short-Term Memory (LSTM); Distributed deep learning; Distributed Keras (Dist-Keras); Apache Spark;
DOI
Not available
CLC number
Subject classification code
Abstract
Internet-of-Things (IoT) devices are becoming prevalent, and some of them, such as sensors, generate continuous time-series data, i.e., streaming data. IoT streaming data are one source of Big Data and require careful consideration for efficient processing and analysis. Deep learning is emerging as a solution for IoT streaming data analytics, but it has a persistent problem: training neural networks takes a long time. In this paper, we propose a high-performance IoT streaming data prediction system that improves training speed and predicts in real time. We demonstrate the efficacy of the system through a case study of air pollution. The experimental results show that the modified LSTM autoencoder model outperforms a generic LSTM model. Achieving the best performance requires optimizing many parameters, including the learning rate, the number of epochs, the memory cell size, the input timestep size, and the number of features/predictors. In that regard, we show that high-performance data learning/prediction frameworks (e.g., Spark, Dist-Keras, and Hadoop) are essential for rapidly fine-tuning a model through training and testing before its real deployment as data accumulate.
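The abstract names the input timestep size, the number of features, and several training parameters as things to tune. The following is a minimal sketch of how those quantities typically enter a time-series pipeline, not the authors' implementation. The helper names `make_windows` and `parameter_grid`, the sample readings, and all candidate values are hypothetical and are not taken from the paper.

```python
from itertools import product


def make_windows(series, timesteps):
    """Split a multivariate series into (input window, next-step target) pairs.

    series: list of rows, each row a list of feature values for one time step.
    timesteps: number of consecutive rows used as one model input.
    """
    if timesteps < 1:
        raise ValueError("timesteps must be at least 1")
    inputs, targets = [], []
    for start in range(len(series) - timesteps):
        inputs.append(series[start:start + timesteps])
        targets.append(series[start + timesteps])
    return inputs, targets


def parameter_grid(**choices):
    """Enumerate every combination of the candidate hyperparameter values."""
    names = list(choices)
    combos = product(*(choices[name] for name in names))
    return [dict(zip(names, values)) for values in combos]


# Hypothetical readings: each row is [pm10, temperature] for one time step.
readings = [[30.0, 21.5], [32.0, 21.0], [35.0, 20.5], [31.0, 20.0], [29.0, 19.5]]
inputs, targets = make_windows(readings, timesteps=3)

# Hypothetical candidate values (assumptions for illustration only).
grid = parameter_grid(
    learning_rate=[0.001, 0.01],
    epochs=[50, 100],
    memory_cells=[32, 64],
    timesteps=[3, 6],
)
```

Even this small grid yields 16 configurations, each requiring a full training run, which illustrates why the abstract argues for distributed training frameworks when tuning.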
Pages: 13147-13154
Page count: 7
Related papers
50 in total
  • [1] High-performance IoT streaming data prediction system using Spark: a case study of air pollution
    Jin, Ho-Yong
    Jung, Eun-Sung
    Lee, Duckki
    NEURAL COMPUTING & APPLICATIONS, 2020, 32 (17): 13147-13154