A Survey and Recommendations for Distributed, Parallel, Single Pass, Incremental Bayesian Classification based on MapReduce for Big Data

Cited by: 0
Authors
Shafiq, M. Omair [1]
Yang, Yibing [1]
Fekri, Maryam [1]
Affiliations
[1] Carleton Univ, Sch Informat Technol, Ottawa, ON, Canada
Keywords
Distributed; Parallel; Single-pass; Incremental; Bayesian; Classification; Big Data
DOI
10.1109/HPCCWS.2017.00013
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In the emerging digital age, massive amounts of data are produced, actively or passively, as applications, sensor devices, and other sources collect data from users and the environment. This makes it crucial to process big data efficiently and to use it effectively. The challenge in processing big data lies in its high volume, velocity, and variety, as well as its veracity and value. In this paper, we present a survey of related work and give our recommendations for building Bayesian classification for big data environments. The proposed solution is based on MapReduce and is distributed, parallel, single-pass, and incremental, which makes it feasible to deploy and execute on a cloud computing platform. We also carry out a scalability analysis of the proposed solution to show that it can train a Bayesian classifier to perform predictive analytics by processing big data with large volume, velocity, and variety.
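The abstract does not spell out the algorithm, so the following is only a minimal sketch of how a single-pass, incremental naive Bayes classifier can be phrased in map/reduce terms. It assumes categorical features, Laplace smoothing, and an in-process simulation of the map and reduce steps in plain Python instead of a real Hadoop cluster; the names `map_partition`, `reduce_counts`, `train`, and `predict` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter
from functools import reduce


def map_partition(records):
    """Map step: a single pass over one partition, emitting count statistics."""
    counts = Counter()
    for features, label in records:
        counts[("class", label)] += 1
        for index, value in enumerate(features):
            counts[("feature", label, index, value)] += 1
    return counts


def reduce_counts(left, right):
    """Reduce step: counts are additive, so merging two partial models is a sum."""
    return left + right


def train(partitions, model=None):
    """Train from scratch, or update an existing count model incrementally."""
    start = model if model is not None else Counter()
    return reduce(reduce_counts, (map_partition(p) for p in partitions), start)


def predict(model, features, alpha=1.0):
    """Return the label with the highest smoothed log-posterior (assumed Laplace smoothing)."""
    labels = [key[1] for key in model if key[0] == "class"]
    total = sum(model[("class", label)] for label in labels)
    # Distinct values seen per feature position, used as the smoothing vocabulary.
    seen = {}
    for key in model:
        if key[0] == "feature":
            seen.setdefault(key[2], set()).add(key[3])
    best_label, best_score = None, -math.inf
    for label in labels:
        class_count = model[("class", label)]
        score = math.log(class_count / total)
        for index, value in enumerate(features):
            numerator = model[("feature", label, index, value)] + alpha
            denominator = class_count + alpha * len(seen.get(index, {value}))
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy data, invented purely for illustration.
batch_1 = [(("sunny", "hot"), "no"), (("rainy", "mild"), "yes")]
batch_2 = [(("sunny", "mild"), "no"), (("rainy", "hot"), "yes")]
later_batch = [(("rainy", "mild"), "yes")]

model = train([batch_1, batch_2])        # partitions can be mapped in parallel
updated = train([later_batch], model)    # incremental update, no retraining
retrained = train([batch_1 + batch_2 + later_batch])
```

Because the model is nothing more than additive counts, each record is read once, partitions can be processed independently, and a later batch can be folded into an existing model; in this sketch `updated` and `retrained` hold identical counts.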
Pages: 42 - 49
Page count: 8