A Survey and Recommendations for Distributed, Parallel, Single Pass, Incremental Bayesian Classification based on MapReduce for Big Data

Authors
Shafiq, M. Omair [1]
Yang, Yibing [1]
Fekri, Maryam [1]
Affiliations
[1] Carleton Univ, Sch Informat Technol, Ottawa, ON, Canada
Keywords
Distributed; Parallel; Single-pass; Incremental; Bayesian; Classification; Big Data
DOI
10.1109/HPCCWS.2017.00013
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In the emerging digital age, massive amounts of data are produced, actively or passively, by collecting data from users and the environment via applications, sensor devices and so on. This makes it crucial to be able to process big data efficiently and utilize it effectively. The challenge in processing big data is that it has high volume, velocity and variety, as well as veracity and value. In this paper, we present a survey of related work and prescribe our recommendations towards building Bayesian classification for big data environments. The proposed solution is based on MapReduce and is distributed, parallel, single-pass and incremental, which makes it feasible to deploy and execute on a cloud computing platform. We also carry out a scalability analysis of the proposed solution, showing that it can train a Bayesian classifier to perform predictive analytics by processing big data with large volume, velocity and variety.
Pages: 42-49
Number of pages: 8