A Survey and Recommendations for Distributed, Parallel, Single Pass, Incremental Bayesian Classification based on MapReduce for Big Data

Authors
Shafiq, M. Omair [1]
Yang, Yibing [1]
Fekri, Maryam [1]
Affiliations
[1] Carleton University, School of Information Technology, Ottawa, ON, Canada
Keywords
Distributed; Parallel; Single-pass; Incremental; Bayesian; Classification; Big Data
DOI
10.1109/HPCCWS.2017.00013
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In the emerging digital age, massive amounts of data are produced, actively or passively, by collecting data from users and the environment via applications, sensor devices, and so on. This makes it crucial to be able to process big data efficiently and to use it effectively. The challenge in processing big data is its high volume, velocity, and variety, as well as its veracity and value. In this paper, we present a survey of related work and give our recommendations for building Bayesian classification for big data environments. The proposed solution is based on MapReduce and is distributed, parallel, single-pass, and incremental, which makes it feasible to deploy and execute on a cloud computing platform. We also carry out a scalability analysis of the proposed solution, showing that it can train a Bayesian classifier to perform predictive analytics by processing big data with large volume, velocity, and variety.
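The abstract gives no implementation details, so the following is only a minimal sketch of how a single-pass, incremental Bayesian classifier can be expressed in a MapReduce style. It assumes a naive Bayes model over categorical features with Laplace smoothing, and it simulates partitions with in-memory lists. The function names (map_partition, reduce_counts, train, predict) are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter
from functools import reduce


def map_partition(records):
    """Map step: a single pass over one partition, emitting count statistics."""
    counts = Counter()
    for features, label in records:
        counts[("class", label)] += 1
        for index, value in enumerate(features):
            counts[("feature", index, value, label)] += 1
    return counts


def reduce_counts(left, right):
    """Reduce step: counts are additive, so merging is a key-wise sum."""
    return left + right


def train(partitions, model=None):
    """Fold the counts of new partitions into an optional existing model."""
    start = model if model is not None else Counter()
    return reduce(reduce_counts, map(map_partition, partitions), start)


def predict(model, features):
    """Return the class with the highest smoothed log-posterior score."""
    classes = [key[1] for key in model if key[0] == "class"]
    total = sum(model[("class", c)] for c in classes)
    best_label, best_score = None, -math.inf
    for c in classes:
        score = math.log(model[("class", c)] / total)
        for index, value in enumerate(features):
            # Distinct values seen for this feature, used for Laplace smoothing.
            values = {k[2] for k in model if k[0] == "feature" and k[1] == index}
            numerator = model[("feature", index, value, c)] + 1
            denominator = model[("class", c)] + len(values)
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = c, score
    return best_label


# Toy data: two partitions of (features, label) records.
partitions = [
    [(("sunny", "hot"), "no"), (("rainy", "mild"), "yes")],
    [(("sunny", "mild"), "no"), (("rainy", "hot"), "yes")],
]
model = train(partitions)
```

Because the model is just a table of counts, each record is read once, partitions can be counted independently, and a later batch can be added by calling train(new_partitions, model) without revisiting earlier data. A real deployment would run the map and reduce steps on a framework such as Hadoop rather than in a single process.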
Pages: 42-49
Page count: 8