High-performance IoT streaming data prediction system using Spark: a case study of air pollution

Authors
Ho-Yong Jin
Eun-Sung Jung
Duckki Lee
Affiliations
[1] Hongik University, Department of Software and Communications Engineering
[2] Yonam Institute of Technology, Department of Smart Software
Keywords
Long Short-Term Memory (LSTM); Distributed deep learning; Distributed Keras (Dist-Keras); Apache Spark
DOI
Not available
Abstract
Internet-of-Things (IoT) devices are becoming prevalent, and some of them, such as sensors, generate continuous time-series data, i.e., streaming data. These IoT streaming data are one source of Big Data, and they require careful consideration for efficient processing and analysis. Deep learning is emerging as a solution for IoT streaming data analytics. However, deep learning has a persistent problem: training neural networks takes a long time. In this paper, we propose a high-performance IoT streaming data prediction system that improves training speed and predicts in real time. We demonstrate the efficacy of the system through a case study of air pollution. The experimental results show that the modified LSTM autoencoder model outperforms a generic LSTM model. We found that achieving the best performance requires optimizing many parameters, including the learning rate, number of epochs, memory cell size, input timestep size, and the number of features/predictors. In that regard, we show that high-performance data learning/prediction frameworks (e.g., Spark, Dist-Keras, and Hadoop) are essential for rapidly fine-tuning a model for training and testing before real deployment as data accumulate.
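The abstract names input timestep size, number of features, learning rate, epochs, and memory cell size as parameters to tune. As a rough illustration only, and not the authors' implementation, the sketch below shows how a multivariate series might be cut into fixed-length input windows and how a grid of candidate settings could be enumerated. The function names (`make_windows`, `hyperparameter_grid`), the sample readings, and every candidate value are hypothetical and chosen for illustration.

```python
from itertools import product


def make_windows(series, timesteps):
    """Split a multivariate series into (input window, next-step target) pairs."""
    samples = []
    for start in range(len(series) - timesteps):
        window = series[start:start + timesteps]  # timesteps rows, one row per reading
        target = series[start + timesteps]        # the reading right after the window
        samples.append((window, target))
    return samples


def hyperparameter_grid(space):
    """Enumerate every combination of the candidate values in `space`."""
    names = list(space)
    combos = product(*(space[name] for name in names))
    return [dict(zip(names, values)) for values in combos]


# Hypothetical readings: each row is [PM10, temperature]; the values are made up.
readings = [[30.0, 12.1], [32.0, 12.4], [35.0, 12.9], [33.0, 13.3], [31.0, 13.0]]

# Hypothetical candidate values for parameters named in the abstract.
space = {
    "learning_rate": [0.001, 0.01],
    "epochs": [50, 100],
    "memory_cells": [32, 64],
    "timesteps": [2, 3],
}

configs = hyperparameter_grid(space)
windows = make_windows(readings, timesteps=2)
print(len(configs), len(windows))
```

Even this small grid yields 16 configurations, each requiring a full training run, which gives a sense of why the abstract argues for distributed training frameworks when tuning.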
Pages: 13147 - 13154
Number of pages: 7