Big Data Processing with harnessing Hadoop - MapReduce for Optimizing Analytical Workloads

Cited by: 0

Authors:

Satish, Rama K., V [1]

Kavya, N. P. [1,2]

Affiliations:

[1] RNS Inst Technol, Bengaluru, India

[2] RNS Inst Technol, Dept MCA, Bengaluru, India

Source:

2014 INTERNATIONAL CONFERENCE ON CONTEMPORARY COMPUTING AND INFORMATICS (IC3I) | 2014

Keywords:

Big data feature selection; firefly; classification; naive-bayes; Map reduce framework; SYSTEM;

DOI:

Not available

Chinese Library Classification number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

Nowadays, social media data is as constant a part of daily life as a heartbeat. The exponential growth of data first presented challenges to cutting-edge businesses such as Google, MSN, Flipkart, Microsoft, Facebook, Twitter, and LinkedIn. Nevertheless, existing big data analytical models for Hadoop rely on MapReduce analytical workloads that process only a small segment of the whole data set, and thus fail to assess the capabilities of the MapReduce model under heavy workloads that process exponentially growing data sizes.[1] In social, business, and technical research applications alike, there is a need to process large volumes of data efficiently. In this paper, we propose an efficient technique to classify big data from email using the firefly algorithm and a naive Bayes classifier. The proposed technique comprises two phases: (i) a MapReduce framework for training and (ii) a MapReduce framework for testing. Initially, the input Twitter data is passed to a feature selection step that chooses suitable features for classification. The traditional firefly algorithm is applied, and the optimized feature space that gives the best-fitting results is adopted. Once the best feature space has been identified by the firefly algorithm, the data is classified using the naive Bayes classifier. Both processes are distributed according to the MapReduce framework. The experimental results are validated using four evaluation metrics: computation time, accuracy, specificity, and sensitivity. For comparative analysis, the proposed big data classification is compared with existing naive Bayes and neural network approaches.
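The abstract gives no implementation details, so the following is only a minimal sketch of how naive Bayes training can be phrased as a map step and a reduce step. It is not the authors' method. It simulates map and reduce in plain Python rather than on Hadoop, it omits the firefly feature selection entirely, and every name in it (`map_counts`, `reduce_counts`, `train`, `predict`, the toy records) is a hypothetical choice made for illustration.

```python
from collections import Counter
from functools import reduce
import math


def map_counts(record):
    """Map step (illustrative): emit class and word counts for one record."""
    tokens, label = record
    counts = Counter()
    counts[("class", label)] += 1
    for tok in tokens:
        counts[("word", label, tok)] += 1
    return counts


def reduce_counts(left, right):
    """Reduce step (illustrative): merge two partial count tables."""
    return left + right


def train(records):
    """Run the map step on every record, then fold the results together."""
    return reduce(reduce_counts, map(map_counts, records), Counter())


def predict(model, tokens, labels, vocab_size):
    """Pick the label with the highest log-posterior (Laplace smoothing)."""
    n_docs = sum(model[("class", c)] for c in labels)
    best_label, best_score = None, -math.inf
    for c in labels:
        # Total number of word occurrences seen for class c.
        total_words = sum(
            v for k, v in model.items() if k[0] == "word" and k[1] == c
        )
        score = math.log(model[("class", c)] / n_docs)
        for tok in tokens:
            score += math.log(
                (model[("word", c, tok)] + 1) / (total_words + vocab_size)
            )
        if score > best_score:
            best_label, best_score = c, score
    return best_label


# Toy data, invented for this sketch only.
records = [
    (["free", "offer", "win"], "spam"),
    (["win", "prize"], "spam"),
    (["meeting", "notes"], "ham"),
    (["project", "meeting"], "ham"),
]
model = train(records)
vocab_size = len({tok for tokens, _ in records for tok in tokens})
print(predict(model, ["win", "free"], ["spam", "ham"], vocab_size))
```

Because the per-record counts are merged by simple addition, the order in which partial tables are combined does not matter. That property is what allows the counting work to be split across many workers in a real MapReduce job.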


Pages: 49 - 54

Number of pages: 6

50 items in total

[41] Processing of Big Educational Data in the Cloud Using Apache Hadoop
Machova, Renata
Komarkova, Jitka
Lnenicka, Martin
[J]. INTERNATIONAL CONFERENCE ON INFORMATION SOCIETY (I-SOCIETY 2016), 2016, : 46 - 49
[42] Managing and Optimizing Big Data Workloads for On-Demand User Centric Reports
Baicoianu, Alexandra
Scheianu, Ion Valentin
[J]. BIG DATA AND COGNITIVE COMPUTING, 2023, 7 (02)
[43] CodHoop: A System for Optimizing Big Data Processing
Asad, Zakia
Chaudhry, Mohammad Asad Rehman
Malone, David
[J]. 2015 9TH ANNUAL IEEE INTERNATIONAL SYSTEMS CONFERENCE (SYSCON), 2015, : 295 - 300
[44] Mathematical Methods for Optimizing Big Data Processing
Syrotkina, Olena
Aleksieiev, Mykhailo
Moroz, Borys
Matsiuk, Serhii
Shevtsova, Olga
Kozlovskyi, Andrii
[J]. 2020 10TH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTER INFORMATION TECHNOLOGIES (ACIT), 2020, : 170 - 176
[45] An Analytical Approach to Evaluation of SSD Effects under MapReduce Workloads
Ahn, Sungyong
Park, Sangkyu
[J]. JOURNAL OF SEMICONDUCTOR TECHNOLOGY AND SCIENCE, 2015, 15 (05) : 511 - 518
[46] A Performance Analysis of MapReduce Task with Large Number of Files Dataset in Big Data Using Hadoop
Pal, Amrit
Agrawal, Pinki
Jain, Kunal
Agrawal, Sanjay
[J]. 2014 FOURTH INTERNATIONAL CONFERENCE ON COMMUNICATION SYSTEMS AND NETWORK TECHNOLOGIES (CSNT), 2014, : 587 - 591
[47] Efficient Storage and Processing of Video Data for Moving Object Detection Using Hadoop/MapReduce
Parsola, Jyoti
Gangodkar, Durgaprasad
Mittal, Ankush
[J]. PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON SIGNAL, NETWORKS, COMPUTING, AND SYSTEMS (ICSNCS 2016), VOL 1, 2017, 395 : 137 - 147
[48] CloudFinder: A System for Processing Big Data Workloads on Volunteered Federated Clouds
Rezgui, Abdelmounaam
Davis, Nickolas
Malik, Zaki
Medjahed, Brahim
Soliman, Hamdy S.
[J]. IEEE TRANSACTIONS ON BIG DATA, 2020, 6 (02) : 347 - 358
[49] Clustering of Association Rules for Big Datasets using Hadoop MapReduce
Moahmmed, Salahadin A.
Alasow, Mohamed A.
El-Alfy, El-Sayed M.
[J]. INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2021, 12 (03) : 536 - 545
[50] HadoopDB: An Architectural Hybrid of MapReduce and DBMS Technologies for Analytical Workloads
Abouzeid, Azza
Bajda-Pawlikowski, Kamil
Abadi, Daniel
Silberschatz, Avi
Rasin, Alexander
[J]. PROCEEDINGS OF THE VLDB ENDOWMENT, 2009, 2 (01):
