In-Mapper combiner based MapReduce algorithm for processing of big climate data

Cited by: 33
Authors
Manogaran, Gunasekaran [1]
Lopez, Daphne [2]
Chilamkurti, Naveen [3]
Affiliations
[1] Univ Calif Davis, Davis, CA 95616 USA
[2] VIT Univ, Sch Informat Technol & Engn, Vellore, Tamil Nadu, India
[3] La Trobe Univ, Dept Comp Sci & Comp Engn, Melbourne, Vic, Australia
Keywords
Big data; Internet of Things; Weather sensor devices; MapReduce programming model; Hadoop distributed file system; SOFTWARE ARCHITECTURE; AUTHENTICATION; FRAMEWORK; MODEL; CHALLENGES; ANALYTICS; SYSTEM; SCHEME
DOI
10.1016/j.future.2018.02.048
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Big data refers to a collection of data so massive that it cannot be processed by conventional data processing tools and technologies. In recent years, the number of data production sources has grown noticeably; examples include high-end streaming devices, wireless sensor networks, satellites, and wearable Internet of Things (IoT) devices. These sources generate a massive volume of data continuously. In this work, a large volume of climate data is collected from IoT weather sensor devices and NCEP. In this paper, a big data processing framework is proposed to integrate climate and health data and to find the correlation between climate parameters and the incidence of dengue. The framework is demonstrated with the MapReduce programming model, Hive, HBase and ArcGIS in a Hadoop Distributed File System (HDFS) environment. The weather parameters collected for the study area, Tamil Nadu, with the help of IoT weather sensor devices and NCEP are minimum temperature, maximum temperature, wind, precipitation, solar and relative humidity. The proposed framework focuses only on climate data for 32 districts of Tamil Nadu, where each district contains 157,680 rows, giving 5,045,760 rows in total. Batch view precomputation of the monthly mean of the various climate parameters would have to process all 5,045,760 rows, which would increase latency in query processing. To overcome this issue, batch views can be precomputed for a smaller number of records, leaving more computation to be done at query time. The In-Mapper based MapReduce framework is used to compute the monthly mean of each climate parameter for each latitude and longitude. The experimental results show that the In-Mapper combiner based algorithm has a lower response time than the existing MapReduce algorithm. (C) 2018 Elsevier B.V. All rights reserved.
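The in-mapper combining idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation and does not use the Hadoop API; it is a plain-Python simulation under stated assumptions. The record layout `(lat, lon, date, value)`, the class name `InMapperCombiningMapper`, and the helpers `reduce_monthly_mean` and `run_job` are all hypothetical names introduced for illustration. Each mapper keeps a running `(sum, count)` per `(lat, lon, month)` key in memory and emits one partial result per key when its input split ends, rather than one pair per input row; the reducer then merges the partials into a mean.

```python
from collections import defaultdict


class InMapperCombiningMapper:
    """Hypothetical mapper that aggregates in memory before emitting."""

    def __init__(self):
        # key -> [running sum, running count]
        self.partials = defaultdict(lambda: [0.0, 0])

    def map(self, record):
        # Assumed record layout: (lat, lon, "YYYY-MM-DD", value)
        lat, lon, date, value = record
        key = (lat, lon, date[:7])  # group by location and year-month
        acc = self.partials[key]
        acc[0] += value
        acc[1] += 1

    def close(self):
        # Emit one (sum, count) pair per key at the end of the split.
        for key, (total, count) in self.partials.items():
            yield key, (total, count)


def reduce_monthly_mean(key, partials):
    """Merge partial (sum, count) pairs from all mappers into a mean."""
    total = sum(p[0] for p in partials)
    count = sum(p[1] for p in partials)
    return key, total / count


def run_job(splits):
    """Simulate map, shuffle and reduce over a list of input splits."""
    shuffled = defaultdict(list)
    emitted = 0  # number of key-value pairs sent through the shuffle
    for split in splits:
        mapper = InMapperCombiningMapper()
        for record in split:
            mapper.map(record)
        for key, partial in mapper.close():
            shuffled[key].append(partial)
            emitted += 1
    means = dict(reduce_monthly_mean(k, v) for k, v in shuffled.items())
    return means, emitted
```

Carrying `(sum, count)` pairs rather than per-mapper means keeps the final result exact, because a mean of means is only correct when every mapper sees the same number of rows per key. The `emitted` counter returned by `run_job` shows how many pairs cross the shuffle, which is the quantity in-mapper combining is meant to reduce.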
Pages: 433 - 445
Page count: 13
Related papers
50 in total
  • [1] A spatiotemporal indexing approach for efficient processing of big array-based climate data with MapReduce
    Li, Zhenlong
    Hu, Fei
    Schnase, John L.
    Duffy, Daniel Q.
    Lee, Tsengdar
    Bowen, Michael K.
    Yang, Chaowei
    [J]. INTERNATIONAL JOURNAL OF GEOGRAPHICAL INFORMATION SCIENCE, 2017, 31 (01) : 17 - 35
  • [2] Prominence of MapReduce in BIG DATA Processing
    Pandey, Shweta
    Tokekar, Vrinda
    [J]. 2014 FOURTH INTERNATIONAL CONFERENCE ON COMMUNICATION SYSTEMS AND NETWORK TECHNOLOGIES (CSNT), 2014 : 555 - 560
  • [3] Verifying Properties of MapReduce-Based Big Data Processing
    Zhang, Nan
    Wang, Meng
    Duan, Zhenhua
    Tian, Cong
    [J]. IEEE TRANSACTIONS ON RELIABILITY, 2022, 71 (01) : 321 - 338
  • [4] Efficient Big Data Processing in Hadoop MapReduce
    Dittrich, Jens
    Quiane-Ruiz, Jorge-Arnulfo
    [J]. PROCEEDINGS OF THE VLDB ENDOWMENT, 2012, 5 (12) : 2014 - 2015
  • [5] Trust-Based Scheduling Framework for Big Data Processing with MapReduce
    Thanh Dat Dang
    Doan Hoang
    Nguyen, Diep N.
    [J]. IEEE TRANSACTIONS ON SERVICES COMPUTING, 2022, 15 (01) : 279 - 293
  • [6] Big Data Prediction Framework for Weather Temperature Based on MapReduce Algorithm
    Ismail, Khalid Adam
    Majid, Mazlina Abdul
    Zain, Jasni Mohamed
    Abu Bakar, Noor Akma
    [J]. 2016 IEEE CONFERENCE ON OPEN SYSTEMS, 2016 : 13 - 17
  • [7] Parallel Clustering Optimization Algorithm Based on MapReduce in Big Data Mining
    Zhang, Huajie
    Song, Lei
    Zhang, Sen
    [J]. IAENG International Journal of Applied Mathematics, 2023, 53 (01)
  • [8] A Top-k Query Algorithm for Big Data Based on MapReduce
    Lin, Xueyan
    [J]. PROCEEDINGS OF 2015 6TH IEEE INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING AND SERVICE SCIENCE, 2015 : 982 - 985
  • [9] Analysis of the Big Data based on MapReduce
    Tian, Zi-de
    [J]. PROCEEDINGS OF THE 2015 INTERNATIONAL CONFERENCE ON AUTOMATION, MECHANICAL CONTROL AND COMPUTATIONAL ENGINEERING, 2015, 124 : 224 - 228
  • [10] High-Performance Geospatial Big Data Processing System Based on MapReduce
    Jo, Junghee
    Lee, Kang-Woo
    [J]. ISPRS INTERNATIONAL JOURNAL OF GEO-INFORMATION, 2018, 7 (10)