Big Data Processing with harnessing Hadoop - MapReduce for Optimizing Analytical Workloads

Cited by: 0
Authors
Satish, Rama K., V [1]
Kavya, N. P. [1, 2]
Affiliations
[1] RNS Inst Technol, Bengaluru, India
[2] RNS Inst Technol, Dept MCA, Bengaluru, India
Keywords
Big data feature selection; firefly; classification; naive-bayes; Map reduce framework; SYSTEM;
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract
Nowadays, social media data is as constant a presence in our lives as a heartbeat. The exponential growth of data first challenged cutting-edge businesses such as Google, MSN, Flipkart, Microsoft, Facebook, Twitter, and LinkedIn. Nevertheless, existing big data analytical models for Hadoop rely on MapReduce analytical workloads that process only a small segment of the whole data set, and thus fail to assess the capabilities of the MapReduce model under heavy workloads that process exponentially growing data sizes [1]. Across social, business, and technical research applications, there is a need to process large volumes of everyday data efficiently. In this paper, we propose an efficient technique to classify big data from email using the firefly algorithm and a naive Bayes classifier. The proposed technique comprises two phases: (i) a MapReduce framework for training and (ii) a MapReduce framework for testing. Initially, the input Twitter data is passed to a feature selection step that picks suitable features for classification. The traditional firefly algorithm is applied, and the optimized feature space that gives the best-fitting results is adopted. Once the best feature space has been identified by the firefly algorithm, the data is classified using the naive Bayes classifier. Both processes are distributed following the MapReduce framework. The experimental results are validated using four evaluation metrics: computation time, accuracy, specificity, and sensitivity. For comparative analysis, the proposed big data classification is compared with existing naive Bayes and neural network approaches.
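To make the training and evaluation steps in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It simulates the map and reduce steps in plain Python on a single machine rather than on Hadoop, assumes the features have already been selected (the firefly step is omitted), and uses a multinomial naive Bayes model with add-one smoothing. All function names (`map_counts`, `reduce_counts`, `train`, `predict`, `metrics`) and the toy records are hypothetical.

```python
import math
from collections import defaultdict


def map_counts(record):
    """Map step: emit ((label, token), 1) pairs for one labelled record.

    The key (label, None) counts documents per class for the prior.
    """
    label, tokens = record
    yield (label, None), 1
    for tok in tokens:
        yield (label, tok), 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted values for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def train(records):
    """Run the map step over every record, then reduce the combined output."""
    pairs = (kv for rec in records for kv in map_counts(rec))
    return reduce_counts(pairs)


def predict(counts, tokens, labels, vocab_size):
    """Return the label with the highest smoothed log-posterior."""
    n_docs = sum(counts[(c, None)] for c in labels)
    best, best_score = None, -math.inf
    for c in labels:
        # Total token occurrences seen for class c (excludes the prior key).
        n_tokens = sum(v for (lab, tok), v in counts.items()
                       if lab == c and tok is not None)
        score = math.log(counts[(c, None)] / n_docs)
        for tok in tokens:
            # Add-one (Laplace) smoothing; an assumption of this sketch.
            score += math.log((counts.get((c, tok), 0) + 1)
                              / (n_tokens + vocab_size))
        if score > best_score:
            best, best_score = c, score
    return best


def metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Toy, made-up records: (label, already-selected feature tokens).
records = [
    ("spam", ["win", "cash"]),
    ("spam", ["win", "prize"]),
    ("ham", ["meeting", "today"]),
    ("ham", ["lunch", "today"]),
]
counts = train(records)
vocab = {tok for _, toks in records for tok in toks}
print(predict(counts, ["win"], ["spam", "ham"], len(vocab)))
print(metrics(tp=8, tn=9, fp=1, fn=2))
```

Because the map step only emits counts and the reduce step only sums them, the same logic could in principle be split across many mappers and reducers; the testing phase would apply `predict` to each record in a map step and aggregate the confusion-matrix counts in a reduce step.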
Pages: 49-54
Number of pages: 6
Related papers
50 in total
  • [1] Efficient Big Data Processing in Hadoop MapReduce
    Dittrich, Jens
    Quiane-Ruiz, Jorge-Arnulfo
    [J]. PROCEEDINGS OF THE VLDB ENDOWMENT, 2012, 5 (12): 2014-2015
  • [2] Interactive Analytical Processing in Big Data Systems: A Cross-Industry Study of MapReduce Workloads
    Chen, Yanpei
    Alspaugh, Sara
    Katz, Randy
    [J]. PROCEEDINGS OF THE VLDB ENDOWMENT, 2012, 5 (12): 1802-1813
  • [3] Big Data Management Processing with Hadoop MapReduce and Spark Technology: A Comparison
    Verma, Ankush
    Mansuri, Ashik Hussain
    Jain, Neelesh
    [J]. 2016 SYMPOSIUM ON COLOSSAL DATA ANALYSIS AND NETWORKING (CDAN), 2016
  • [4] A Comparison of Big Remote Sensing Data Processing with Hadoop MapReduce and Spark
    Chebbi, I.
    Boulila, W.
    Mellouli, N.
    Lamolle, M.
    Farah, I. R.
    [J]. 2018 4TH INTERNATIONAL CONFERENCE ON ADVANCED TECHNOLOGIES FOR SIGNAL AND IMAGE PROCESSING (ATSIP), 2018
  • [5] Clustering on Big Data Using Hadoop MapReduce
    Akthar, Nadeem
    Ahamad, Mohd Vasim
    Khan, Shahbaz
    [J]. 2015 INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND COMMUNICATION NETWORKS (CICN), 2015: 789-795
  • [6] Architecture of Efficient Word Processing using Hadoop MapReduce for Big Data Applications
    Mandal, Bichitra
    Sahoo, Ramesh Kumar
    Sethi, Srinivas
    [J]. PROCEEDINGS 2015 INTERNATIONAL CONFERENCE ON MAN AND MACHINE INTERFACING (MAMI), 2015
  • [7] Implementation of on-process aggregation for Efficient Big Data Processing in Hadoop MapReduce Environment
    Pol, Vidya V.
    Patil, S. M.
    [J]. 2016 INTERNATIONAL CONFERENCE ON INVENTIVE COMPUTATION TECHNOLOGIES (ICICT), VOL 3, 2015: 445-449
  • [8] Performance Modelling and Analysis of MapReduce/Hadoop Workloads
    Yu, Xiaolong
    Li, Wei
    [J]. 2015 IEEE 21ST INTERNATIONAL WORKSHOP ON LOCAL & METROPOLITAN AREA NETWORKS (LANMAN), 2015
  • [9] MREv: an Automatic MapReduce Evaluation Tool for Big Data Workloads
    Veiga, Jorge
    Exposito, Roberto R.
    Taboada, Guillermo L.
    Tourino, Juan
    [J]. INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE, ICCS 2015 COMPUTATIONAL SCIENCE AT THE GATES OF NATURE, 2015, 51: 80-89
  • [10] An Approach to Enhance the Performance of Hadoop MapReduce Framework for Big Data
    Chandra, Subhash
    Motwani, Deepak
    [J]. 2016 INTERNATIONAL CONFERENCE ON MICRO-ELECTRONICS AND TELECOMMUNICATION ENGINEERING (ICMETE), 2016: 178-182