Big Data Processing with harnessing Hadoop - MapReduce for Optimizing Analytical Workloads

Authors
Satish, Rama K., V [1]
Kavya, N. P. [1, 2]
Affiliations
[1] RNS Inst Technol, Bengaluru, India
[2] RNS Inst Technol, Dept MCA, Bengaluru, India
Keywords
Big data feature selection; firefly; classification; naive-bayes; Map reduce framework; SYSTEM;
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Nowadays, social media data has become as constant in daily life as a heartbeat. The exponential growth of data first presented challenges to cutting-edge businesses such as Google, MSN, Flipkart, Microsoft, Facebook, Twitter, and LinkedIn. Nevertheless, existing big data analytical models for Hadoop run MapReduce analytical workloads that process only a small segment of the whole data set, and thus fail to assess the capabilities of the MapReduce model under heavy workloads that process exponentially growing data sizes [1]. In social, business, and technical research applications, there is a need to process large volumes of everyday user data efficiently. In this paper, we propose an efficient technique to classify big data from email using the firefly algorithm and a naive Bayes classifier. The proposed technique comprises two phases: (i) a MapReduce framework for training and (ii) a MapReduce framework for testing. Initially, the input Twitter data is passed to a feature selection step that chooses suitable features for classification. The traditional firefly algorithm is applied, and the optimized feature space that yields the best fit is adopted. Once the best feature space has been identified by the firefly algorithm, the data is classified using the naive Bayes classifier. Both processes are distributed following the MapReduce framework. The experimental results are validated using four evaluation metrics: computation time, accuracy, specificity, and sensitivity. For comparative analysis, the proposed big data classification is compared with existing naive Bayes and neural network approaches.
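The two-phase structure described in the abstract can be illustrated with a minimal single-process sketch. It is not the authors' implementation: the map and reduce steps are plain Python functions rather than Hadoop jobs, the `selected` feature set is a hand-picked stand-in for the output of the firefly step (which is not shown), and the toy records, the `spam`/`ham` labels, and all function names are hypothetical. The classifier is a multinomial naive Bayes with add-one smoothing, which is one common variant and an assumption here.

```python
import math
from collections import defaultdict


def map_counts(tokens, label):
    """Map step: emit one document-count pair and one pair per token."""
    pairs = [(("doc", label, None), 1)]
    pairs += [(("tok", label, tok), 1) for tok in tokens]
    return pairs


def reduce_counts(pairs):
    """Reduce step: sum the emitted values for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return totals


def train(partitions, selected):
    """Training phase: each partition stands in for one mapper's input split."""
    pairs = []
    for part in partitions:
        for tokens, label in part:
            kept = [t for t in tokens if t in selected]  # keep selected features only
            pairs.extend(map_counts(kept, label))
    return reduce_counts(pairs)


def predict(model, tokens, selected, labels):
    """Score each class with log prior + smoothed log likelihoods."""
    n_docs = sum(model[("doc", c, None)] for c in labels)
    best, best_score = None, -math.inf
    for c in labels:
        class_tokens = sum(model[("tok", c, t)] for t in selected)
        score = math.log(model[("doc", c, None)] / n_docs)
        for t in tokens:
            if t in selected:
                count = model[("tok", c, t)]
                score += math.log((count + 1) / (class_tokens + len(selected)))
        if score > best_score:
            best, best_score = c, score
    return best


def evaluate(model, test, selected, positive, negative):
    """Testing phase: accuracy, sensitivity, and specificity for two classes."""
    tp = tn = fp = fn = 0
    for tokens, label in test:
        guess = predict(model, tokens, selected, [positive, negative])
        if label == positive:
            tp, fn = tp + (guess == positive), fn + (guess != positive)
        else:
            tn, fp = tn + (guess == negative), fp + (guess != negative)
    return {
        "accuracy": (tp + tn) / len(test),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical toy data: two input splits of (tokens, label) records.
partitions = [
    [(["free", "win", "prize"], "spam"), (["meeting", "report"], "ham")],
    [(["win", "cash", "free"], "spam"), (["project", "meeting", "lunch"], "ham")],
]
# Assumed output of the feature selection step (firefly search not shown).
selected = {"free", "win", "meeting", "report", "project"}
test = [(["free", "win"], "spam"), (["meeting", "project"], "ham")]

model = train(partitions, selected)
metrics = evaluate(model, test, selected, positive="spam", negative="ham")
print(metrics)
```

Because training reduces to summing counts per key, the same map and reduce functions could in principle run as independent tasks over separate input splits, which is the property the abstract relies on when it distributes both phases.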
Pages: 49-54
Page count: 6