Big Data Processing with harnessing Hadoop - MapReduce for Optimizing Analytical Workloads

Cited by: 0
Authors
Satish, Rama K., V [1]
Kavya, N. P. [1, 2]
Affiliations
[1] RNS Inst Technol, Bengaluru, India
[2] RNS Inst Technol, Dept MCA, Bengaluru, India
Keywords
Big data feature selection; firefly; classification; naive-bayes; Map reduce framework; SYSTEM;
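DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract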
Social media data has become as constant a part of daily life as a heartbeat. The exponential growth of data first presented challenges to cutting-edge businesses such as Google, MSN, Flipkart, Microsoft, Facebook, Twitter, and LinkedIn. Nevertheless, existing big data analytical models for Hadoop rely on MapReduce analytical workloads that process only a small segment of the whole data set, and thus fail to assess the capabilities of the MapReduce model under heavy workloads that process exponentially growing data sizes [1]. In all social, business, and technical research applications, there is a need to process large volumes of everyday data efficiently. In this paper, we propose an efficient technique for classifying big data from email using the firefly algorithm and a naive Bayes classifier. The proposed technique comprises two phases: (i) a MapReduce framework for training and (ii) a MapReduce framework for testing. Initially, the input Twitter data is passed to a feature selection step that chooses suitable features for data classification. The traditional firefly algorithm is applied, and the optimized feature space that gives the best-fitting results is adopted. Once the best feature space has been identified by the firefly algorithm, the data is classified using the naive Bayes classifier. Both processes are distributed according to the MapReduce framework. The experimental results are validated using four evaluation metrics: computation time, accuracy, specificity, and sensitivity. For comparative analysis, the proposed big data classification is compared with existing naive Bayes and neural network approaches.
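The training and testing phases described in the abstract can be illustrated with a minimal single-process sketch. It is not the authors' implementation: it assumes token-list records, uses plain Python `map` and `reduce` in place of a Hadoop job, and takes the selected feature set as a given input (`selected` is a hypothetical stand-in for the output of the firefly search, which is not implemented here). The function names `map_counts`, `train`, `classify`, and `evaluate` are illustrative only.

```python
import math
from collections import Counter
from functools import reduce


def map_counts(record):
    """Map step: emit class and (class, token) counts for one labelled record."""
    tokens, label = record
    counts = Counter({("class", label): 1})
    for token in tokens:
        counts[("feat", label, token)] += 1
    return counts


def train(records):
    """Reduce step: merge the per-record counts by summing values per key."""
    return reduce(lambda a, b: a + b, map(map_counts, records), Counter())


def classify(model, tokens, selected):
    """Multinomial naive Bayes with add-one smoothing over the selected features."""
    labels = sorted({key[1] for key in model if key[0] == "class"})
    n_docs = sum(model[("class", c)] for c in labels)
    # Only features that were both seen in training and selected are used.
    vocab = {key[2] for key in model if key[0] == "feat"} & set(selected)
    best_label, best_score = None, -math.inf
    for c in labels:
        total = sum(model[("feat", c, t)] for t in vocab)
        score = math.log(model[("class", c)] / n_docs)
        for t in tokens:
            if t in vocab:
                score += math.log((model[("feat", c, t)] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = c, score
    return best_label


def evaluate(y_true, y_pred, positive):
    """Accuracy, sensitivity and specificity for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    records = [
        (["win", "prize", "now"], "spam"),
        (["free", "prize"], "spam"),
        (["meeting", "at", "noon"], "ham"),
        (["lunch", "meeting"], "ham"),
    ]
    selected = {"win", "prize", "free", "meeting", "lunch", "noon"}
    model = train(records)
    print(classify(model, ["free", "prize", "now"], selected))
```

Because the reduce step only sums counts per key, the same map and reduce functions could in principle be run over data partitions in parallel, which is the property a MapReduce deployment relies on.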
Pages: 49-54
Number of pages: 6