A Survey and Recommendations for Distributed, Parallel, Single Pass, Incremental Bayesian Classification based on MapReduce for Big Data

Cited by: 0

Authors:

Shafiq, M. Omair [1]

Yang, Yibing [1]

Fekri, Maryam [1]

Affiliation:

[1] Carleton Univ, Sch Informat Technol, Ottawa, ON, Canada

Source:

2017 IEEE 19TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS WORKSHOPS (HPCCWS): MULTICORE AND MULTITHREADED ARCHITECTURES AND ALGORITHMS (M2A2 2017) | 2017

Keywords:

Distributed; Parallel; Single-pass; Incremental; Bayesian; Classification; Big Data;

DOI:

10.1109/HPCCWS.2017.00013

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology]

Discipline classification code:

0812

Abstract:

In the emerging digital age, data is produced on a massive scale, actively or passively, as applications, sensor devices, and similar sources collect it from users and the environment. This makes the ability to process big data efficiently and use it effectively crucial. The challenge in processing big data lies in its high volume, velocity, and variety, as well as its veracity and value. In this paper, we present a survey of related work and give our recommendations for building Bayesian classification for big data environments. The proposed solution is based on MapReduce and is distributed, parallel, single-pass, and incremental, which makes it feasible to deploy and execute on a cloud computing platform. We also carry out a scalability analysis of the proposed solution to show that it can train a Bayesian classifier to perform predictive analytics by processing big data with large volume, velocity, and variety.

Citation

Pages: 42-49

Page count: 8
